Methods for identifying compounds that modulate disorders related to nitric oxide/cGMP-dependent protein kinase signaling

ABSTRACT

The invention provides a method of identifying a compound that modulates a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal. The method consists of (a) administering a test compound to an invertebrate; and (b) measuring a foraging behavior of the invertebrate, wherein a compound that modulates the foraging behavior of the invertebrate is characterized as a compound that modulates a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal.

BACKGROUND OF THE INVENTION

[0001] Attention Deficit/Hyperactivity Disorder (ADHD) affects approximately 4% to 6% of the U.S. population. The most common core features include excessive and pervasive forms of distractibility (poor sustained attention to tasks), impulsivity (impaired impulse control and delay of gratification) and hyperactivity (excessive activity and physical restlessness). ADHD is a clinical diagnosis that can be broken down into subtypes known as combined type, predominantly inattentive type, and predominantly hyperactive-impulsive type. Although individuals lacking the hyperactivity component of ADHD are often referred to as having ADD, or attention deficit disorder, the diagnosis of ADHD is known in the medical arts to encompass all forms described above.

[0002] ADHD is commonly diagnosed in children and usually persists throughout a person's lifetime such that approximately one-half to two-thirds of children with ADHD will continue to have significant problems with ADHD symptoms and behaviors as adults. For both children and adults, ADHD can have significant impact on their lives on the job, within the family, and in social relationships. In addition to symptoms of ADHD, it is common for people having the disorder to present with co-morbid conditions such as anxiety disorders, substance abuse, or learning disabilities.

[0003] Currently available drugs to treat ADHD are stimulant medications such as Ritalin (methylphenidate), Dexedrine or Adderall. These stimulants can have a variety of undesirable side effects such as stomach upset, cramps, loss of appetite, diarrhea, headache, nervousness, dizziness, sleep problems, irritability or restlessness, and, in extreme cases, seizures or convulsions, blurred vision, heart problems and unusual bleeding or bruising.

[0004] About 50 million adults in the U.S. have hypertension, a condition to which 20 to 50% of all natural deaths can be traced. Untreated hypertension can damage the kidneys and lead to stroke, heart attack or heart failure. Currently available treatments can also cause health problems. Hypertension is usually treated with diuretics, beta blockers or calcium blockers, drugs which can cause fatigue, weakness, headache, joint and stomach aches, nausea, impotence or urinary tract problems. In some cases the potential benefits of antihypertensives may not outweigh their negative effects on quality of life. In this regard, increasing numbers of antihypertensive medications are associated with lower reported general health status.

[0005] Clearly there is a need to identify drugs that alleviate ADHD and hypertension without undesirable side effects. At present there do not exist good methods of screening for such drugs. The present invention satisfies this need and provides related advantages as well.

SUMMARY OF THE INVENTION

[0006] The invention provides a method of identifying a compound that modulates a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal. The method consists of (a) administering a test compound to an invertebrate; and (b) measuring a foraging behavior of the invertebrate, wherein a compound that modulates the foraging behavior of the invertebrate is characterized as a compound that modulates a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal.

[0007] The invention also provides a method of identifying a polynucleotide that correlates with a disease associated with nitric oxide/cGMP-dependent protein kinase network in a mammal. The method consists of (a) obtaining a first and a second strain of an invertebrate; (b) subjecting the first and second invertebrate strains to conditions in which the first strain exhibits a foraging behavior different than a foraging behavior exhibited by the second strain; (c) measuring an expression level for one or more polynucleotides in the first and second strains, and (d) identifying one or more polynucleotides that are differentially expressed in the first strain relative to the second strain, wherein a mammalian polynucleotide comprising substantially the same nucleic acid sequence as the one or more differentially expressed polynucleotides correlates with ADHD in a mammal.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] FIG. 1 shows a schematic of the foraging maze used in measuring foraging scores of Drosophila.

[0009] FIG. 2 shows the effects of transferring flies to new tubes on rest. The amount of rest in minutes is plotted for each hour of a 24-hour day. Time 0 on the graph is at “lights-on” as described in Example IV. During baseline (squares) flies show a typical amount and distribution of rest. After being physically transferred to new tubes (circles), all flies obtained amounts of rest similar to baseline, indicating that the transfer process did not alter the sleep-wake cycle.

[0010] FIG. 3 shows the effects of methylphenidate on rest in flies. The amount of rest in minutes is plotted for each hour of a 24-hour day. Time 0 on the graph is at “lights-on” as described in Example IV. During baseline (squares) flies show a typical amount and distribution of rest. After the flies were physically transferred to new tubes containing 0.5 mg/ml methylphenidate (circles), rest was significantly reduced compared to baseline, indicating that methylphenidate can increase waking and reduce sleep in an invertebrate.

DETAILED DESCRIPTION OF THE INVENTION

[0011] The methods and compounds disclosed herein are based on the discovery that foraging behavior in an invertebrate can be changed by altering molecular components of the nitric oxide/cGMP-dependent protein kinase network. In this regard, it is demonstrated herein that administration of a compound used for treating ADHD in humans can be effective in changing foraging behavior of invertebrates. Accordingly, the present invention provides methods of rapidly and efficiently identifying compounds that modulate attention deficit hyperactivity disorder (ADHD) in a mammal. The invention also provides methods for rapidly and efficiently identifying compounds that modulate other diseases involving the nitric oxide/cGMP-dependent protein kinase network in a mammal including, for example, hypertension.

[0012] A method of identifying a compound that modulates ADHD consists of administering a test compound to an invertebrate and measuring a foraging behavior of the invertebrate. A candidate compound that modulates the foraging behavior of the invertebrate is characterized as a compound that modulates ADHD. The method can also be used to identify compounds that modulate other diseases involving the nitric oxide/cGMP-dependent protein kinase network in a mammal including, for example, hypertension. A compound identified by the methods of the invention can be used to treat an individual suffering from ADHD or hypertension.

[0013] According to the methods of the invention, changes in expression of one or more representative genes in a network of genes can be associated with changes in foraging behavior of an invertebrate to identify compounds that modulate the nitric oxide/cGMP-dependent protein kinase network. Representative genes can include one gene, a subset of genes, or a set of all the genes whose expression changes upon modulation of ADHD or hypertension in a mammal. Therefore, the present invention also provides mammalian genes that modulate ADHD or hypertension. A compound identified by the method of the invention can act to modulate the activity or expression of a mammalian gene of the invention.

[0014] Until now, there has been no indication that the relative genetic simplicity of invertebrates compared to mammals provides networks of genes controlling foraging behavior in invertebrates that are similar to those affecting behavioral disorders such as ADHD in mammals or cardiopulmonary disorders such as hypertension. In fact, invertebrates have been shown to have dissimilar pharmacological responses compared to mammals within the same class of neuronal receptors. Thus, the use of invertebrates to test compounds affecting behavioral and cardiopulmonary disorders in mammals has not been pursued prior to this disclosure.

[0015] The methods of the invention provide a means to identify compounds that modulate ADHD, hypertension, or other diseases involving the nitric oxide/cGMP-dependent protein kinase network by screening foraging behavior of invertebrates which is a natural genetically encoded network. Thus, the methods provide for screening compounds in the context of a natural system which represents the mammalian nitric oxide/cGMP-dependent protein kinase network more closely than any in vitro assay could. Additionally, the degree to which compounds modulate a mammalian system can be identified by the methods of the invention because strains of invertebrates that have naturally evolved different foraging behaviors and degrees of response can be used with the methods of the invention.

[0016] As used herein, the term “compound” refers to an inorganic or organic molecule such as a drug; a peptide, or a variant or modified peptide or a peptide-like molecule such as a peptidomimetic or peptoid; or a polypeptide such as an antibody, a growth factor, or cytokine, or a fragment thereof such as an Fv, Fd or Fab fragment of an antibody, which contains a binding domain; or a polynucleotide or chemically modified polynucleotide such as an antisense polynucleotide; or a carbohydrate or lipid.

[0017] As used herein, the term “test compound” refers to an inorganic or organic molecule that is administered to an invertebrate for the purpose of determining its effects on an invertebrate foraging behavior. A test compound can be administered as a pure preparation or as a mixture with one or more other molecules. For example, a test compound can be combined with, or dissolved in, an agent that facilitates uptake of the compound by the invertebrate, such as an organic solvent, for example, DMSO or ethanol; or an aqueous solvent, for example, water or a buffered aqueous solution; or food.

[0018] As used herein the term “administering” when used in reference to a compound and an invertebrate refers to delivering the compound to the invertebrate in a manner allowing internalization of the test compound. A compound can be delivered, for example, by ingestion, inhalation, topical application, or injection. Examples of delivery by ingestion include feeding a compound to invertebrates or adding a compound to an invertebrate's food. Topical delivery includes, for example, exposing an invertebrate to an aerosol preparation of a compound or a liquid preparation of a compound such that the compound contacts the exterior or interior membranes.

[0019] As used herein, the term “ADHD” refers to attention deficit hyperactivity disorder, which is a diagnosis applied to an individual who displays symptoms of excessive distractibility, impulsivity, or hyperactivity. Attention deficit hyperactivity disorder includes attention deficit disorder (ADD). Attention deficit hyperactivity disorder can be subdivided into clinical subtypes including, for example, predominantly inattentive type, predominantly hyperactive-impulsive type or combined type. An individual with attention deficit hyperactivity disorder can display a reduction in distractibility, impulsivity, or hyperactivity when administered stimulants including, for example, methylphenidate (Ritalin), amphetamines such as Dexedrine and Adderall, or pemoline (Cylert).

[0020] As used herein, the term “hypertension” refers to abnormally high blood pressure. High blood pressure is a relative measure that depends upon a statistical estimate of the distribution of systolic and diastolic blood pressures in the general population. As blood pressure is affected by factors such as age and gender, one skilled in the art can recognize high blood pressure by comparison to a cohort of matched individuals. Abnormally high blood pressure can be recognized as deviations from normal systolic blood pressure which include, for example, 80 mm Hg for an infant, 130 mm Hg for a 20 year old male and 140 mm Hg for a 40 year old male. Symptoms of hypertension can include, for example, drowsiness, confusion, headache, nausea, and loss of vision. Other symptoms can include, for example, increased risk for arteriosclerosis, angina pectoris, sudden death, stroke, dissecting aortic aneurysm, intracerebral hemorrhage, rupture of the myocardial wall or atherothrombotic occlusion of the abdominal aorta.

[0021] As used herein the term “modulates” refers to an increase, decrease or alteration. The term can be used to indicate an increase, decrease or alteration of a phenotype or behavior. For example, an alteration of a foraging behavior can be a loss or gain of a behavior. An increased foraging behavior can be an increased duration of search, frequency of searching, energy expended in searching, distance of search, rate of search, area of search, or diligence of search. Accordingly, a decreased foraging behavior refers to, for example, a decreased duration of search, frequency of searching, energy expended in searching, distance of search, rate of search, area of search, or diligence of search. An alteration of phenotype can be, for example, a loss or gain of a characteristic associated with a particular genotype. Examples of increased phenotype include increased foraging behavior in a Rover or decreased foraging behavior in a sitter. A decreased phenotype can include, for example, decreased foraging behavior in a Rover or increased foraging behavior in a sitter.

[0022] A condition can be modulated by altering severity, extent, intensity, magnitude, duration or frequency of the condition. Modulating a condition can also include, for example, altering severity, extent, intensity, magnitude, duration or frequency of a symptom of the condition. Modulation of an expression level of a polynucleotide includes, for example, an increase in an amount of polynucleotide or polypeptide produced from the polynucleotide or decrease in an amount of polynucleotide or polypeptide produced from the polynucleotide. Modulation of an expression level of a polypeptide includes, for example, an increase in an amount of polypeptide produced from a polynucleotide or decrease in an amount of polypeptide produced from a polynucleotide. Modulation of an activity of a polypeptide includes, for example, increasing catalytic rate, decreasing catalytic rate, changing binding specificity, increasing binding rate or decreasing binding rate.

[0023] As used herein the term “invertebrate” refers to an animal that lacks a backbone. Invertebrates are understood to refer to members of the division Invertebrata as understood in the art. An invertebrate can be a fly including, for example, fruit flies, sand flies, mayflies, blowflies, flesh flies, face flies, houseflies, screwworm flies, stable flies, mosquitos, northern cattle grub, and the like. As disclosed herein, Drosophila is an example of an invertebrate. Fruit flies include Drosophila species such as D. melanogaster, D. simulans, D. virilis, D. pseudoobscura, D. funebris, D. immigrans, D. repleta, D. affinis, D. saltans, D. sulphurigaster albostrigata and D. nasuta albomicans. Other invertebrate insects include, for example, cockroaches, honeybees, wasps, termites, grasshoppers, moths, butterflies, fleas, lice, boll weevils, beetles, Apis mellifera, A. florea, A. cerana, Tenebrio molitor, Bombus terrestris, B. lapidarius, and members of Hydrocorisae. Arthropods are also invertebrates including, for example, scorpions, spiders, mites, crustaceans, centipedes and millipedes. Other invertebrates including, for example, flatworms, nematodes (e.g. C. elegans), mollusks (e.g. Aplysia or Hermissenda), echinoderms and annelids will exhibit foraging behavior and express polynucleotides associated with foraging behavior, and can be used in the methods of the invention.

[0024] As used herein, the term “reference invertebrate” refers to a member of the division Invertebrata or population of such members for which a polynucleotide expression level has been measured such that a polynucleotide expression level for another invertebrate can be compared. A polynucleotide expression level for a population of invertebrates can be an average, mean or median value derived from polynucleotide expression levels of individual invertebrates in the population. A reference invertebrate can be the same or different species as the invertebrate with which expression levels will be compared.
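By way of illustration only, the population-level reference value described above can be sketched as follows. The function name, the sample values and the arbitrary expression units are hypothetical and are not taken from this disclosure.

```python
from statistics import mean, median


def reference_expression_level(levels, summary="mean"):
    """Collapse per-individual expression levels into one reference value.

    `levels` holds one measured expression level per individual invertebrate
    in the reference population (hypothetical, arbitrary units).
    """
    if not levels:
        raise ValueError("at least one expression level is required")
    if summary == "mean":
        return mean(levels)
    if summary == "median":
        return median(levels)
    raise ValueError(f"unknown summary: {summary!r}")


# Hypothetical measurements for a single polynucleotide in four individuals.
population = [10.0, 12.0, 11.0, 30.0]
ref_mean = reference_expression_level(population, "mean")      # 15.75
ref_median = reference_expression_level(population, "median")  # 11.5
```

In this hypothetical population the median is less affected by the single high value than the mean, which is one reason either summary may be chosen.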

[0025] As used herein the term “adult,” when used in reference to Drosophila melanogaster refers to an individual that has passed the pupal stage. An adult can be distinguished from individuals in other developmental stages according to the presence of wings and the ability to reproduce.

[0026] As used herein the term “larva,” when used in reference to Drosophila melanogaster refers to an individual that has hatched from an egg and has not yet formed an immobile pupa. The larval stages include, for example, first, second and third instars.

[0027] As used herein the term “mammal” refers to a vertebrate animal distinguishable from other vertebrate animals, for example, by self-regulating body temperature, hair or, in the female, mammae. Examples of mammals include a human, dog, cat, or horse.

[0028] As used herein the term “foraging behavior” refers to the actions of an individual or population of individuals in the presence of food or in a fed state. Actions included in the term can be for example, duration of search, frequency of searching, energy expended searching, distance of search, direction of search, rate of search, area of search, efficiency of search or diligence of search.

[0029] As used herein the term “different,” when used in reference to foraging behavior of individuals, refers to a measured value of a foraging behavior of a first invertebrate or a mean value of foraging behaviors measured for individuals in a population that is different from a measured value of a foraging behavior of a second invertebrate or a mean value of foraging behaviors measured for individuals in a second population if a pairwise t-test of two scores is significantly different at the 0.05 level, or if multiple pairwise comparisons between individuals are significantly different after applying a correction for experimentwise error. A significantly different score refers to a score that is different by a statistically meaningful amount. Two foraging behaviors are also considered different if a first measured value for a foraging behavior is not within a desired region of the probability distribution of a second measured value for a foraging behavior. For example, a first measured value for a foraging behavior can be different if it is not within the 80% probable region of a probability distribution of the second measured value of a foraging behavior, or within the 85%, 90%, 95% or 98% probable region of the distribution of the second measured value of a foraging behavior. Correspondingly, measured values of a foraging behavior considered to be substantially the same are values that do not differ by more than a desired standard deviation or are within a desired probable region of a probability distribution. Methods for the determination of mean, standard deviation and characteristics of normal distributions are known in the art as demonstrated by texts such as Biostatistical Analysis, 4th ed., Zar, Prentice-Hall Inc. (1999).
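As a non-limiting sketch, two of the criteria described above can be expressed in code. The first function tests whether a measured value falls outside a chosen probable region of a distribution fitted to reference scores; a normal distribution is assumed here for illustration. The second applies a Bonferroni adjustment, which is only one possible correction for experimentwise error and is not named in this disclosure. The p-values are assumed to have been obtained separately, for example from pairwise t-tests, and all scores shown are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def outside_probable_region(value, reference_scores, region=0.95):
    """True if `value` lies outside the central `region` (e.g. 0.80, 0.95)
    of a normal distribution fitted to `reference_scores` (assumption)."""
    dist = NormalDist(mean(reference_scores), stdev(reference_scores))
    tail = (1.0 - region) / 2.0
    lower = dist.inv_cdf(tail)
    upper = dist.inv_cdf(1.0 - tail)
    return not (lower <= value <= upper)


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each pairwise comparison as significant after dividing the
    significance level by the number of comparisons (Bonferroni)."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Hypothetical foraging scores for a reference population.
reference_scores = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0]

far_value_differs = outside_probable_region(9.0, reference_scores)         # True
near_value_differs = outside_probable_region(5.5, reference_scores)        # False
flags = bonferroni_significant([0.01, 0.04, 0.001])  # [True, False, True]
```

A narrower probable region is a less stringent test of difference: in this hypothetical data set a score of 6.5 falls outside the 80% region but inside the 95% region.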

[0030] As used herein the term “sitter” refers to an individual Drosophila melanogaster that is homozygous for for^(S) alleles. The term “Rover” as used herein refers to an individual Drosophila melanogaster that contains a for^(R) allele. Rover/sitter is a polymorphism in Drosophila melanogaster. The phenotype of a Rover and sitter can be distinguished, for example, according to the distance traveled during foraging such that a Rover travels a longer pathlength when foraging than does a sitter. In the absence of food, or in a starved condition, Rovers and sitters show similar mobility.
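For illustration, the pathlength comparison described above can be sketched as a sum of straight-line segments between successive recorded positions. The coordinate tracks, their units and the sampling scheme below are hypothetical; this disclosure does not prescribe a particular tracking method in this paragraph.

```python
from math import hypot


def path_length(positions):
    """Total distance traveled along a sequence of (x, y) positions,
    computed as the sum of straight-line segments between samples."""
    return sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    )


# Hypothetical tracks recorded during foraging (arbitrary length units).
rover_track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
sitter_track = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]

rover_distance = path_length(rover_track)    # 10.0
sitter_distance = path_length(sitter_track)  # 2.0
```

Under these assumed tracks the Rover-like individual travels the longer pathlength, consistent with the phenotypic distinction described above.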

[0031] As used herein the term “expression level,” when used in reference to a polynucleotide, refers to a quantity of translation or transcription product produced from a polynucleotide in a given condition. A polynucleotide expression level can be, for example, an amount of RNA transcribed from a DNA, an amount of polypeptide translated from an RNA or an amount of polypeptide produced from transcription and translation of a DNA. Accordingly, increased expression refers to a larger quantity of transcription or translation product produced in a time period or a faster rate of producing transcription or translation products. Correspondingly, decreased expression refers to a smaller quantity of transcription or translation product produced in a time period or a slower rate of producing transcription or translation products.

[0032] As used herein the term “differentially expressed” refers to dissimilar quantities of translation or transcription product produced from a polynucleotide in a given condition. Dissimilar quantities can be identified by the variance relative to a reference quantity. A reference quantity can be, for example, the variability of expression levels between invertebrates having, for example, similar genetic makeup, age, gender, developmental conditions, or the like. In such a situation, a different level can be a difference that is greater than the mean difference observed between expression levels of the invertebrates, or greater than the largest expression level difference observed between selected polynucleotides of the invertebrates. Different levels can also be based on the composite variability of polynucleotide expression levels between two or more strains. For example, the mean or median difference between polynucleotide expression levels can be determined between a large number of different strains. Any difference in expression that is greater than the mean or median difference can be considered differentially expressed. Other reference levels defining a significant difference can be determined by one of skill in the art according to the desired comparison between two or more invertebrates.
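One way to picture the reference-based comparison described above is the following minimal sketch. It assumes that a list of reference differences has already been collected, for example between invertebrates of similar genetic makeup or across a number of strains, and flags a polynucleotide when its difference between two strains exceeds the mean or median of those reference differences. The names `gene_x`, `gene_y` and `gene_z` and all values are hypothetical.

```python
from statistics import mean, median


def differentially_expressed(levels_a, levels_b, reference_differences,
                             summary="mean"):
    """Return names of polynucleotides whose absolute expression difference
    between two strains exceeds the mean (or median) reference difference.

    levels_a, levels_b: dicts mapping polynucleotide name -> expression level.
    reference_differences: previously observed differences used as baseline.
    """
    if not reference_differences:
        raise ValueError("reference differences are required")
    summarize = median if summary == "median" else mean
    cutoff = summarize(list(reference_differences))
    shared = sorted(set(levels_a) & set(levels_b))
    return [
        name for name in shared
        if abs(levels_a[name] - levels_b[name]) > cutoff
    ]


# Hypothetical expression levels (arbitrary units) in two strains.
strain_a = {"gene_x": 10.0, "gene_y": 5.0, "gene_z": 7.0}
strain_b = {"gene_x": 13.0, "gene_y": 5.5, "gene_z": 7.0}
baseline = [0.5, 1.0, 1.5]  # hypothetical reference differences

flagged = differentially_expressed(strain_a, strain_b, baseline)  # ["gene_x"]
```

Other reference levels, as noted above, can be chosen by one of skill in the art; the cutoff rule here is only one of the options the paragraph describes.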

[0033] Differential expression can also be determined for invertebrates of the same strain that have been subjected to conditions in which a first group of members of a strain exhibit a foraging behavior different from the foraging behavior of a second group of members of the strain. This can be carried out, for example, by administration of a compound, presence of light, time of day, and the like. Differential expression is then determined by measuring expression levels in the two groups and identifying polynucleotides expressed at significantly different levels.

[0034] As used herein, a “strain” refers to a population of organisms of a species having at least one similar phenotype including, for example, a foraging phenotype. This population of organisms can have either identical or a somewhat heterogeneous genetic makeup, although heterogeneous populations typically contain individuals that are homozygous for one or more chromosomes. For example, a population of organisms having a similar phenotype can be a population of organisms of a species sharing a similar genetic origin as the result of either being isolated from a particular geographic area, sharing particular chromosomes or alleles, or having been bred for multiple generations for a particular phenotype.

[0035] The term “conditions” when used in the context of invertebrate foraging behavior refers to environmental and biological factors that can influence invertebrate foraging behavior. Influences on invertebrate foraging behavior can cause, for example, increase, decrease or modification of a foraging behavior. Environmental factors encompass the physical environment such as temperature, pressure, light intensity, light position, and the like; components of the gaseous environment such as humidity, % oxygen, presence of a compound such as a drug or hormone, and the like; presence or absence of food; food quality; food quantity; food composition, and the structural makeup of the chamber in which the invertebrate is housed, including volume, particularly as it influences density of invertebrates, shape, composition of the chamber, and the like. Biological factors that can influence invertebrate foraging can include genetic factors including presence of particular alleles of genes or chromosomes, either naturally occurring or induced in the laboratory, biorhythmic factors such as time of day, relative activity level of an invertebrate, length of time an invertebrate has been active, and the like. Biological factors also include biochemical factors such as developmental and hormonal state of an invertebrate, fasting state of the invertebrate, presence in the invertebrate of an administered compound, gender and age.

[0036] As used herein, the term “polynucleotide molecule” refers to both deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) molecules, and can optionally include one or more non-native nucleotides, having, for example, modifications to the base, the sugar, or the phosphate portion, or having a modified phosphodiester linkage. The term polynucleotide molecule includes both single-stranded and double-stranded polynucleotides, representing the sense strand, the anti-sense strand, or both, and includes linear, circular or branched molecules. A polynucleotide molecule of the invention can contain 2 or more nucleotides. Exemplary polynucleotide molecules include genomic DNA, cDNA, mRNA and oligonucleotides, corresponding to either the coding or non-coding portion of the molecule, and optionally containing sequences required for expression. A polynucleotide molecule of the invention, if desired, can additionally contain a detectable moiety, such as a radiolabel, a fluorochrome, a ferromagnetic substance, a luminescent tag or a binding agent such as biotin.

[0037] As used herein, the term “isolated,” when used in reference to a polynucleotide molecule, is intended to mean that the molecule is substantially removed or separated from components with which it is naturally associated, or otherwise modified by a human hand, thereby excluding polynucleotide molecules as they exist in nature. An isolated polynucleotide molecule of the invention can be in solution or suspension, or immobilized on a filter, glass slide, chip, culture plate or other solid support. The degree of purification of the polynucleotide molecule, and its physical form, can be determined by those skilled in the art depending on the intended use of the molecule.

[0038] As used herein, a “fragment” of a polynucleotide refers to a portion of a polynucleotide that includes at least 2 contiguous nucleotides of the polynucleotide. A mammalian polynucleotide can be substantially the same as an invertebrate polynucleotide, for example, when a fragment of a mammalian polynucleotide that encodes a polypeptide domain corresponds to a polypeptide domain-encoding fragment of an invertebrate polynucleotide. Such a fragment typically is encoded by at least 30 nucleotides, and the mammalian and invertebrate polynucleotides encoding that fragment share at least about 50% identity, at least about 60% identity, at least about 70% identity, at least about 80% identity, at least about 90% identity, at least about 95% identity or at least about 98% identity. Methods for determining that a fragment of a mammalian polynucleotide is substantially the same as an invertebrate polynucleotide or a fragment of an invertebrate polynucleotide include those described above for comparing mammalian and invertebrate polynucleotides. Such a fragment can be encoded by 30 or more nucleotides, for example, 45 or more nucleotides, 60 or more nucleotides, 90 or more nucleotides, 150 or more nucleotides, 210 or more nucleotides, or 300 or more nucleotides.

[0039] Biological functions retained by a fragment can include the ability to modulate ADHD or hypertension in a mammal, the ability to modulate invertebrate foraging, the ability to bind an antibody that binds to a full-length polypeptide which comprises the fragment, or an enzymatic or binding activity characteristic of the full length polypeptide.

[0040] The term “substantially the same” as used herein in reference to the relationship between a mammalian polynucleotide and an invertebrate polynucleotide refers to a mammalian polynucleotide or corresponding amino acid sequence that has a high degree of homology to an invertebrate polynucleotide or corresponding amino acid sequence and retains at least one function specific to the invertebrate polynucleotide or corresponding amino acid sequence. In the case of a nucleotide sequence, a first polynucleotide that is substantially the same as a second polynucleotide can selectively hybridize to a sequence complementary to the second polynucleotide under moderately stringent conditions or under highly stringent conditions. Therefore, a first polynucleotide molecule having substantially the same sequence compared to a second polynucleotide sequence can include, for example, one or more additions, deletions or substitutions with respect to the second sequence so long as it can selectively hybridize to a complement of that sequence.

[0041] In the case of an amino acid sequence, a first amino acid sequence that is substantially the same as a second amino acid sequence can contain minor modifications with respect to the second amino acid sequence, so long as the polypeptide containing the first amino acid sequence retains one or more functional activities exhibited by the whole polypeptide containing the second amino acid sequence.

[0042] Typically, a substantial similarity for a polypeptide or polynucleotide is represented by at least about 20% identity between mammalian and invertebrate sequences; mammalian and invertebrate sequences that are substantially the same can also share at least about 30% identity, at least about 40% identity, at least about 50% identity, at least about 60% identity, at least about 70% identity, at least about 80% identity, at least about 90% identity, at least about 95% identity, at least about 97% identity, or at least about 99% identity over the length of the two sequences being compared. Those skilled in the art know that two or more polypeptides having low overall sequence similarity can be substantially similar if the polypeptides have similar domains with substantial sequence similarity. For example, polypeptides having 20% overall identity can be substantially similar if the polypeptides contain one or more domains of substantial similarity. A larger number of similar domains between two or more polypeptides correlates with increased similarity. Therefore, substantial similarity can be identified according to sequence identity within similar domains of two or more polypeptides. Examples of methods for determining substantial similarity using sequence identity or a combination of sequence identity and similarity in domain structure are described below.
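As an illustrative sketch only, percent identity between two sequences that have already been aligned can be computed as shown below. The alignment itself is assumed to have been produced by a separate tool; the convention of counting a gap opposite a residue as a mismatch, the function names and the example sequences are assumptions made for this example. Identity alone does not establish that two sequences are substantially the same, since retained function and domain-level similarity are also considered.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned sequences of equal length.

    '-' marks a gap. Columns where both sequences have a gap are skipped;
    a gap opposite a residue counts as a mismatch (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=20.0):
    """True if percent identity is at least `threshold` (e.g. 20, 50, 90)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned nucleotide sequences.
identity = percent_identity("ACGTACGT", "ACGTTCGA")      # 75.0
identity_with_gap = percent_identity("ACG-T", "ACGTT")  # 80.0
```

The same function can be applied to a single aligned domain rather than to full-length sequences when similarity is assessed within domains.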

[0043] As used herein, the term “phenotype” refers to a set of detectable outward manifestations that are correlated with one or more alleles or genetic loci of an organism. A set can include one, all, or a portion of all of the detectable outward manifestations. A phenotype can include detectable outward manifestations that are correlated with an entire genotype of an organism or a subset of alleles of a specific genotype. An example of a phenotype is Rover which is manifested, in part, by exploration of a broad area or long distance when an individual having the for^(R) allele perceives the presence of food or is in a fed state. The for^(S) allele is manifested in homozygous individuals, for example, as a phenotype which includes exploration of a limited area or short distance in the presence of food or when in a fed state.

[0044] As used herein, the term “antibody” is consistent with the definition of the term in the art and includes polyclonal and monoclonal antibodies, as well as antigen binding fragments of such antibodies. An antibody of the invention is characterized by having specific binding activity for a polypeptide associated with invertebrate foraging or with modulation of ADHD or hypertension in a mammal of at least about 1×10⁵ M⁻¹. Thus, Fab, F(ab′)₂, Fd and Fv fragments of a polypeptide-specific antibody of the invention, which retain specific binding activity, are included within the definition of an antibody. Methods of preparing polyclonal or monoclonal antibodies against polypeptides are well known in the art (see, for example, Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1988)).

[0045] In addition, the term “antibody” as used herein includes naturally occurring antibodies as well as non-naturally occurring antibodies, including, for example, single chain antibodies, chimeric, bifunctional and humanized antibodies, as well as antigen-binding fragments thereof. Such non-naturally occurring antibodies can be produced or obtained by methods known in the art, including constructing the antibodies using solid phase peptide synthesis, recombinant production, or screening combinatorial libraries consisting of variable heavy chains and variable light chains.

[0046] The invention provides a method of identifying a compound that modulates a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal. The method consists of (a) administering a test compound to an invertebrate; and (b) measuring a foraging behavior of the invertebrate, wherein a compound that modulates the foraging behavior of the invertebrate is characterized as a compound that modulates a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal.

[0047] Thus, the invention provides, for example, a method of identifying a compound that modulates ADHD in a mammal. The method consists of (a) administering a test compound to an invertebrate; and (b) measuring a foraging behavior of the invertebrate, wherein a compound that modulates the foraging behavior of the invertebrate is characterized as a compound that modulates ADHD in a mammal.

[0048] The invention also provides a method of identifying a compound that modulates hypertension in a mammal. The method consists of (a) administering a test compound to an invertebrate; and (b) measuring a foraging behavior of the invertebrate, wherein a compound that modulates the foraging behavior of the invertebrate is characterized as a compound that modulates hypertension in a mammal.

[0049] A foraging behavior of an invertebrate can be observed as an action of the invertebrate in the presence of food or in a fed state. A behavior of an invertebrate that is preferentially or even exclusively displayed in the presence of food as compared to in the absence of food or in the fed state as compared to a starved state is desirable in the methods of the invention. Those behaviors for which a modulation can be quantified or reproducibly evaluated are preferred in the invention and include, for example, duration of search, frequency of searching, energy expended searching, distance of search, direction of search, rate of search, area of search, efficiency of search or diligence of search. One skilled in the art will be able to choose a particular behavior to be observed according to the features of the invertebrate as they affect behavior including, for example, means of locomotion such as flying or walking or developmental stage of the invertebrate.

[0050] An example of a foraging behavior useful in the methods of the invention is a phenotypic behavior associated with the foraging (for) gene. The for gene can be functionally characterized based on its influence on food search behavior in fruit flies as described, for example, in Sokolowski, Behav. Genet. 10:291 (1980) and de Belle et al., Genetics 123:157 (1989). Individuals with a for^(R) allele are referred to as Rovers and explore a considerably wider area in the presence of food than do sitters, who are homozygous for the for^(S) allele, as described, for example, in de Belle et al., Heredity 59:73 (1987). Rovers and sitters display differences in foraging behavior in both the larval and adult life stages as described, for example, in Pereira et al., Proc. Natl. Acad. Sci. USA 90:5044 (1993). Larval Rover and sitter strains do not significantly differ in activity in the absence of food but adults do differ somewhat. However, there is a food dependent differential effect on adult behavior in the presence of food.

[0051] An advantage of the invention is that the methods can be used to distinguish a compound that has a specific effect on ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal from a compound that has a non-specific effect. Because Rovers and sitters are both wild type forms that exist at appreciable frequencies in natural populations, the phenotypic behaviors displayed by both can be considered as normal, wild type behaviors. Changes in foraging behavior that cause a sitter to behave more like a Rover or a Rover to behave more like a sitter can be identified as having a specific effect on foraging. Therefore, the methods of the invention can be used to distinguish a compound that has a specific effect on foraging behavior from a compound that induces aberrant behavior by non-specific effects. In this regard, a compound that has a non-specific effect can be identified, for example, as a compound that in some way hinders or disables an invertebrate from foraging effectively. A compound identified by the methods of the invention as having a specific effect on foraging behavior, for example, by causing a sitter to behave more like a Rover or a Rover to behave more like a sitter, can be further identified as a compound having a specific effect on ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal.

[0052] An example of a foraging behavior that can be evaluated in invertebrates is an area explored in the presence of food or in a fed state. Any method appropriate for the particular invertebrate can be used to introduce food. Foraging behavior of the invertebrate in response to food can be determined by a variety of means. For example, foraging behavior can be measured as a distance traversed in a fed state or a rate at which an identified distance is traversed. An exemplary method for determining foraging behavior in a population of adult Drosophila by determining a distance traveled in a defined amount of time is presented in Example I. As described in Example I, the number of flies that reach a collection tube in the chamber shown in FIG. 1 before a defined point in time can be summed and expressed as a percentage of the total number of flies tested to yield a foraging score. In addition, a foraging score for a population of flies can be determined by measuring the mean pathlength traversed by each fly in a defined amount of time. Foraging scores provide a convenient measure allowing quantitative determination of foraging behavior and comparison of foraging behaviors in different strains and/or different conditions.
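
The two foraging scores described above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the procedure of Example I: the function names, the use of `None` to mark a fly that never reached the collection tube, and the units are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence


def foraging_score_percent(arrival_times_s: Sequence[Optional[float]],
                           cutoff_s: float) -> float:
    """Percent of all tested flies reaching the collection tube before cutoff_s.

    arrival_times_s holds one entry per tested fly; None marks a fly that
    never reached the tube during the trial (an assumption of this sketch).
    """
    total = len(arrival_times_s)
    if total == 0:
        raise ValueError("at least one fly must be tested")
    reached = sum(1 for t in arrival_times_s if t is not None and t < cutoff_s)
    return 100.0 * reached / total


def mean_pathlength(pathlengths: Sequence[float]) -> float:
    """Mean pathlength traversed per fly in the defined observation period."""
    if not pathlengths:
        raise ValueError("at least one pathlength is required")
    return mean(pathlengths)


if __name__ == "__main__":
    # Hypothetical observations for five flies and a 60-second cutoff.
    print(foraging_score_percent([30, 45, None, 120, 50], cutoff_s=60))
    print(mean_pathlength([10.0, 20.0, 30.0]))
```

Either score can then be compared between strains or between treated and untreated populations.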

[0053] Various manual and automated assays can be used to evaluate foraging behavior. For example, activity can be detected visually, either by direct observation or by photographic means. Additionally, an automated monitoring system can be used to detect motion at a specific distance from a food source including, for example, light beam detectors commonly employed in chromatographic fraction detectors or motion detectors commonly employed in security systems. As a further example, an infrared monitoring system, such as the infrared Drosophila Activity Monitoring System available from Trikinetics (described in M. Hamblen et al., J. Neurogen. 3:249 (1986)), can be used. Automated detection systems are advantageous when simultaneously evaluating activity in large numbers of invertebrates.

[0054] Those skilled in the art can determine an appropriate method to evaluate foraging behavior in a particular application of the method, depending on considerations such as the size and number of invertebrates, their normal activity level, the intended number of data points, and whether a quantitative or qualitative assessment of activity is desired.

[0055] An invertebrate of the invention can be an insect including, for example, Drosophila melanogaster. Invertebrates are understood to refer to members of the division invertebrate. As disclosed herein, Drosophila melanogaster is an example of an invertebrate that exhibits foraging behavior that can be measured. Those skilled in the art understand that other Drosophila species are also likely to exhibit similar foraging behavior and express polynucleotides associated with foraging behavior, including D. simulans, D. virilis, D. pseudoobscura, D. funebris, D. immigrans, D. repleta, D. affinis, D. saltans, D. sulphurigaster albostrigata and D. nasuta albomicans. Likewise, other flies, including sand flies, mayflies, blowflies, flesh flies, face flies, houseflies, screwworm flies, stable flies, mosquitos, northern cattle grub, and the like will also exhibit foraging behavior and express polynucleotides associated with foraging behavior.

[0056] Furthermore, insects other than flies can also exhibit foraging behavior and express polynucleotides associated with foraging behavior. For example, the invention can also be practiced with insects such as cockroaches, honeybees, wasps, termites, grasshoppers, moths, butterflies, fleas, lice, boll weevils, beetles, Apis mellifera, A. florea, A. cerana, Tenebrio molitor, Bombus terrestris, B. lapidarius, and members of Hydrocorisae.

[0057] Arthropods other than insects also can exhibit foraging behavior and express polynucleotides associated with foraging behavior. For example, the invention can also be practiced using arthropods such as scorpions, spiders, mites, crustaceans, centipedes and millipedes.

[0058] Due to the high degree of genetic similarity across invertebrate species, invertebrates other than arthropods can exhibit foraging behavior and express polynucleotides associated with foraging behavior. For example, foraging behaviors of nematodes such as those described for C. elegans in de Bono and Bargmann, Cell 94:679-689 (1998) can be useful in the methods of the invention. Other invertebrates useful in the invention include, for example, flatworms, mollusks (e.g. Aplysia or Hermissenda), echinoderms and annelids, all of which can exhibit foraging behavior and express polynucleotides associated with foraging behavior.

[0059] Those skilled in the art can determine, using the assays described herein, whether a particular invertebrate exhibits foraging behavior and expresses polynucleotides associated with foraging behavior and, therefore, would be applicable for use in the methods of the invention. The choice of invertebrate will also depend on additional factors, for example, the availability of the invertebrates, the normal activity levels of the invertebrates, the availability of molecular probes for polynucleotides associated with foraging behavior, the number of invertebrates and compounds one intends to use, the ease and cost of maintaining the invertebrates in a laboratory setting, the method of administering and type of compounds being tested, and the particular property being evaluated. Those skilled in the art can evaluate these factors in determining an appropriate invertebrate to use in the screening methods.

[0060] For example, if it is desired to evaluate gene expression in the methods of the invention, an invertebrate that is genetically well-characterized, such that homologs of genes associated with foraging behavior are known or can be readily determined, can be used. Thus, appropriate invertebrates in which to evaluate gene expression can include, for example, Drosophila and C. elegans. If it is desired to evaluate behavioral properties in the methods of the invention, an invertebrate that exhibits one or more foraging behaviors, such as fruit flies, cockroaches, honeybees, wasps, moths, mosquitos, scorpions, and the like, can be used.

[0061] An invertebrate of the invention can be at any stage of development so long as a foraging behavior can be measured. For example, a Drosophila melanogaster can be an adult or larva. Larvae display the canonical foraging behavior as originally defined and used to identify the naturally occurring Rover and sitter alleles, show natural bimodal distribution in populations and map to the for locus [Sokolowski, M. Behav. Genet. 10: 291-302 (1980); de Belle et al., Genetics 123: 157-163 (1989)]. Adults have the advantage of being testable either as individuals or in a population assay, as well as having most of the genetically identifiable markers to permit construction and testing of various genotypes. In comparison to larvae, adults have much larger brains providing more convenient tissue samples for biochemical and molecular analysis (about 200,000 neurons vs. about 10,000) and can provide more easily enriched brain preparations due to ease of isolation of heads from adults compared to isolation of whole bodies from larvae. Differences between adult and larval flies are described, for example, in Ashburner, Drosophila: A Laboratory Handbook, Cold Spring Harbor Laboratory Press: Cold Spring Harbor, N.Y. (1989).

[0062] The methods of the invention can be practiced by contacting an invertebrate with a candidate compound and evaluating its effect on a foraging behavior. An appropriate method of administering a compound to an invertebrate can be determined by those skilled in the art and will depend, for example, on the type and developmental stage of the invertebrate, whether the invertebrate is active or inactive at the time of administering, whether the invertebrate is exhibiting a foraging behavior at the time of administering, the number of animals being assayed, and the chemical and biological properties of the compound (e.g. solubility, digestibility, bioavailability, stability and toxicity). For example, as shown in Example IV below, Ritalin can be administered to Drosophila melanogaster by dissolving the drug in fly food and providing the food to the flies.

[0063] A candidate compound can be administered to an invertebrate in a single dose, or in multiple doses. It is expected that the modulation of invertebrate foraging behavior will be dose dependent. An effective amount of a compound used in the methods of the invention can be determined by those skilled in the art, and can depend on the chemical and biological properties of the compound and the method of contacting the invertebrate. Exemplary concentration ranges to test include from about 10 μg/ml to about 500 mg/ml, such as from about 100 μg/ml to 250 mg/ml, including from about 1 mg/ml to 200 mg/ml.
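
Because the modulation is expected to be dose dependent and the exemplary range spans several orders of magnitude, one convenient way to choose test concentrations is a logarithmically spaced series. The sketch below is an assumption offered for illustration; the specification does not prescribe any particular spacing, and the function name and number of points are hypothetical.

```python
from typing import List


def log_spaced_concentrations(low_mg_per_ml: float,
                              high_mg_per_ml: float,
                              n_points: int) -> List[float]:
    """Return n_points concentrations spaced by a constant ratio.

    Endpoints are inclusive. Units are mg/ml, so 10 ug/ml is entered as 0.01.
    """
    if n_points < 2:
        raise ValueError("need at least two points")
    if low_mg_per_ml <= 0 or high_mg_per_ml <= low_mg_per_ml:
        raise ValueError("require 0 < low < high")
    ratio = (high_mg_per_ml / low_mg_per_ml) ** (1.0 / (n_points - 1))
    return [low_mg_per_ml * ratio ** i for i in range(n_points)]


if __name__ == "__main__":
    # About 10 ug/ml to about 500 mg/ml, the broadest range named above.
    for c in log_spaced_concentrations(0.01, 500.0, 6):
        print(f"{c:.4g} mg/ml")
```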

[0064] The appropriate time and duration to administer a compound can be determined by those skilled in the art depending on the application of the method. For example, it may be desirable to administer a compound prior to introducing food to the invertebrate, in the presence of food for a defined duration, or continuously in the presence of food, depending on the foraging behavior being evaluated, the mode of administration, the rate at which the compound has an effect, the duration of the compound's effect and the desired effect of the compound. If desired, a candidate compound can be combined with, or dissolved in, an agent that facilitates uptake of the compound by the invertebrate, such as an organic solvent, for example, DMSO or ethanol; or an aqueous solvent, for example, water or a buffered aqueous solution; or food. One skilled in the art will know the appropriate formulation based on the method of delivery to be used.

[0065] A compound used to contact the invertebrate can be identified as any molecule that potentially alters foraging. Additionally, a compound to be used in the methods of the invention can be identified based on presumed or predicted activity in ADHD, hypertension, or other disease associated with a NO/cGMP-dependent kinase network in a mammal as indicated for example by molecular properties, interactions observed at the molecular or cellular level, clinical evidence, or other empirical evidence known to one skilled in the art to be predictive of such activities.

[0066] In regard to molecular interactions, a compound for use in the methods of the invention can be identified based on its ability to alter polynucleotide or polypeptide expression or activity. For example, a compound can directly interact with a gene promoter; can interact with transcription factors that regulate polynucleotide expression; can bind to or cleave a polynucleotide transcript (e.g. antisense polynucleotides or ribozymes); can alter half-life of the transcript; or can itself be an expressible polynucleotide associated with invertebrate foraging or with ADHD, hypertension or a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal. A compound can also be identified or designed based on inhibition or activation of a cellular component known to be involved in ADHD, hypertension or a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal. For example, a compound can specifically bind to a polypeptide and alter its activity or half-life or a compound can bind to a substrate or modulator of a polypeptide.

[0067] A candidate compound can be a naturally occurring macromolecule, such as a peptide, polynucleotide, carbohydrate, lipid, or any combination thereof, or a partially or completely synthetic derivative, analog or mimetic of such a macromolecule. A candidate compound can also be a small organic or inorganic molecule, either naturally occurring, or prepared partly or completely by synthetic methods.

[0068] The methods of the invention can be performed with a single compound or by screening a number of compounds, including for example, a library of compounds. The number of different compounds to screen using the methods of the invention can be determined by those skilled in the art depending on the application of the method. For example, a smaller number of candidate compounds would generally be used if the type of compound that is likely to modulate foraging behavior in an invertebrate, or ADHD, hypertension or other disease associated with nitric oxide/cGMP-dependent protein kinase network in a mammal, is known or can be predicted, such as when derivatives of a lead compound are being tested. However, when the type of compound that is likely to modulate foraging behavior is unknown, it is generally understood that the larger the number of candidate compounds screened, the greater the likelihood of identifying a compound that modulates foraging behavior. Therefore, the methods of the invention can employ screening individual compounds separately or populations of compounds including small populations and large or diverse populations, to identify a compound that modulates foraging behavior, and thereby also modulates ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal.

[0069] Methods for producing libraries of compounds to use in the methods of the invention, including chemical or biological molecules such as simple or complex organic molecules, metal-containing compounds, carbohydrates, peptides, polypeptides, peptidomimetics, lipids, polynucleotides, antibodies and the like, are well known in the art (see, for example, Huse, U.S. Pat. No. 5,264,563; Francis et al., Curr. Opin. Chem. Biol. 2:422-428 (1998); Tietze et al., Curr. Biol., 2:363-371 (1998); Sofia, Mol. Divers. 3:75-94 (1998); Eichler et al., Med. Res. Rev. 15:481-496 (1995); Gordon et al., J. Med. Chem. 37: 1233-1251 (1994); Gordon et al., J. Med. Chem. 37: 1385-1401 (1994); Gordon et al., Acc. Chem. Res. 29:144-154 (1996); Wilson and Czarnik, eds., Combinatorial Chemistry: Synthesis and Application, John Wiley & Sons, New York (1997); Gold et al., U.S. Pat. Nos. 5,475,096 (1995), 5,789,157 (1998), and 5,270,163 (1993)). The advantage of using such a combinatorial library is that molecules do not have to be individually generated to identify a compound that modulates foraging behavior in an invertebrate. Also, no prior knowledge of the exact characteristics of molecular components associated with foraging behavior in an invertebrate or ADHD, hypertension, or a disease associated with a NO/cGMP-dependent kinase network in a mammal is required when using a combinatorial library. Libraries containing large numbers of natural and synthetic compounds also can be individually synthesized or obtained from commercial sources.

[0070] Following contacting an invertebrate with a compound, a foraging behavior can be evaluated and a determination made as to the effect of the compound on the foraging behavior. A compound can have a variety of effects on foraging including, for example, changing a foraging behavior, increasing a foraging behavior, or decreasing a foraging behavior. A changed foraging behavior that occurs following administration of a compound can be observed, for example, as a new strategy of foraging not observed in the absence of the compound. An increased foraging behavior can be observed, for example, as increased duration of search, frequency of searching, energy expended searching, distance of search, rate of search, area of search or diligence of search.

[0071] A compound can also have the effect of changing a foraging phenotype. In this regard, a foraging phenotype can be changed such that the phenotype is decreased. For example, a compound identified by the methods of the invention can decrease a foraging phenotype in a sitter: a sitter, which normally forages a relatively short distance for food, can be changed to forage a greater distance by a compound administered in the methods of the invention. A compound identified by the methods of the invention can also decrease a foraging phenotype of a Rover. For example, administration of a compound to a Rover, which normally explores a large distance in the presence of food, can induce the Rover to decrease the distance traveled in the presence of food. A compound can also change a foraging phenotype by increasing the phenotype. For example, a compound can further decrease foraging behavior in a sitter or further increase foraging behavior in a Rover.
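
The direction conventions in the paragraph above can be summarized in a short sketch. It assumes that a single numeric foraging score (for example, distance traveled in the presence of food) is available for treated and untreated animals of a known strain; the function name, the returned labels, and the `min_delta` tolerance are hypothetical and not part of the disclosed methods.

```python
def classify_phenotype_shift(strain: str,
                             untreated_score: float,
                             treated_score: float,
                             min_delta: float = 0.0) -> str:
    """Label the effect of a compound on a Rover or sitter foraging phenotype.

    A sitter that forages farther, or a Rover that forages less far, shows a
    "decreased" phenotype; a sitter that forages even less, or a Rover that
    forages even more, shows an "increased" phenotype. Differences no larger
    than min_delta are reported as "unchanged".
    """
    strain = strain.lower()
    if strain not in ("rover", "sitter"):
        raise ValueError("strain must be 'rover' or 'sitter'")
    delta = treated_score - untreated_score
    if abs(delta) <= min_delta:
        return "unchanged"
    more_foraging = delta > 0
    if strain == "sitter":
        return "decreased" if more_foraging else "increased"
    return "increased" if more_foraging else "decreased"


if __name__ == "__main__":
    # Hypothetical scores: a sitter that forages farther after treatment.
    print(classify_phenotype_shift("sitter", 20.0, 45.0))
```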

[0072] A compound of the invention can modulate the expression or activity of one or more mammalian polypeptides. A compound that modulates the expression of a polypeptide can, for example, increase or decrease the quantity of the polypeptide produced from a gene or other polynucleotide. For example, a compound can affect the transcription of a DNA or the translation of an RNA encoding the polypeptide. An activity of a polypeptide that when modulated can cause modulation of foraging behavior in an invertebrate can include stability to proteolysis or other form of cellular inactivation, binding activity with a ligand, enzymatic activity, binding activity with other cellular components, or susceptibility to post-translational modifications such as phosphorylation, prenylation, isoprenylation and the like.

[0073] As described herein, a compound that modulates invertebrate foraging behavior can also modulate ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal. A compound identified by the methods of the invention as modulating foraging behavior in an invertebrate can decrease the severity of ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal or a symptom thereof. For example, a compound that decreases foraging behavior of an invertebrate can decrease the severity of ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal or a symptom thereof. Thus, a compound that decreases foraging behavior of a Rover can decrease the severity of ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal or a symptom thereof. In addition, a compound that increases foraging behavior of an invertebrate can decrease the severity of ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal or a symptom thereof. Thus, a compound that increases foraging behavior of a sitter can decrease the severity of ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal or a symptom thereof.

[0074] A compound of the invention can modulate ADHD or hypertension in any mammal including, for example, a human. Modulation of ADHD can be identified as any change in severity of ADHD or change in a symptom of ADHD including, for example, a decrease or removal thereof. For example, a compound identified by the methods of the invention to modulate foraging behavior in an invertebrate can also be identified as modulating distractibility, impulsivity, or hyperactivity in a human having ADHD. Modulation of hypertension can similarly be identified as any change in severity of hypertension or change in a symptom of hypertension including, for example, a decrease or removal thereof. Thus, a compound identified by the methods of the invention to modulate foraging behavior in an invertebrate can also be identified as modulating abnormally high blood pressure; drowsiness; confusion; headache; nausea; loss of vision; or increased risk for arteriosclerosis, angina pectoris, sudden death, stroke, dissecting aortic aneurysm, intracerebral hemorrhage, rupture of the myocardial wall or atherothrombotic occlusion of the abdominal aorta in an individual having hypertension.

[0075] A compound identified by the methods of the invention can modulate other diseases associated with a NO/cGMP-dependent protein kinase network in a mammal including, for example, diabetes (Tooke, J. Diabetes Complications 14:197-200 (2000)); atherosclerosis (Bundy et al., Gen. Pharmacol. 34:73-84 (2000)); coronary artery disease (Ijem and Granlie, S. D. J. Med. 53:489-491 (2000)); cirrhosis (Knotek et al., Can. J. Gastroenterol. 14:112D-121D (2000)); asthma and bronchitis (Ratjen, Pediatr. Allergy Immunol. 11:230-235 (2000)); uveitis, retinopathy, macular degeneration, glaucoma and myopia (Chiou et al., J. Ocul. Pharmacol. Ther. 16:407-418 (2000)); nonalcoholic steatohepatitis associated with obesity (Garcia-Monzon et al., J. Hepatol. 33:716-724 (2000)); Duchenne muscular dystrophy (Sander et al., Proc. Natl. Acad. Sci. USA 97:13818-13823 (2000)); and sleep apnea (Kato, Circulation 102:2607-2610 (2000)).

[0076] A method of identifying a compound that modulates a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal can consist of (a) administering a test compound to an invertebrate; (b) measuring an expression level for one or more polynucleotides in the invertebrate; and (c) comparing the expression level for one or more polynucleotides in the invertebrate to an expression level of one or more polynucleotides in a reference invertebrate, wherein a compound having the effect of modulating the expression level of one or more polynucleotides associated with invertebrate foraging behavior in the test invertebrate relative to the reference invertebrate is identified as a compound that modulates a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal.

[0077] Therefore, a method of identifying a compound that modulates ADHD in a mammal can consist of (a) administering a test compound to an invertebrate; (b) measuring an expression level for one or more polynucleotides in the invertebrate; and (c) comparing the expression level for one or more polynucleotides in the invertebrate to an expression level of one or more polynucleotides in a reference invertebrate, wherein a compound having the effect of modulating the expression level of one or more polynucleotides associated with invertebrate foraging behavior in the test invertebrate relative to the reference invertebrate is identified as a compound that modulates ADHD in a mammal.

[0078] Additionally, a method of identifying a compound that modulates hypertension in a mammal can consist of (a) administering a test compound to an invertebrate; (b) measuring an expression level of one or more polynucleotides in the invertebrate; and (c) comparing the expression level of one or more polynucleotides in the invertebrate to an expression level of one or more polynucleotides in a reference invertebrate, wherein a compound having the effect of modulating the expression level of one or more polynucleotides associated with invertebrate foraging behavior in the test invertebrate relative to the reference invertebrate is identified as a compound that modulates hypertension in a mammal.

[0079] One or more polynucleotides having modulated expression levels in the methods of the invention can be selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104. A mammalian polynucleotide can be identified having substantially the same sequence as a polynucleotide sequence selected from the above group.

[0080] According to the invention a compound that has the effect of increasing expression of one or more polynucleotides in an invertebrate relative to a reference invertebrate can be identified as a compound that decreases severity of ADHD or hypertension. Additionally, a compound that has the effect of decreasing expression of a specific polynucleotide in a test invertebrate relative to a reference invertebrate can be identified as a compound that decreases a symptom of ADHD or hypertension.
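
The comparison of an expression level in a test invertebrate against a reference invertebrate can be illustrated with a log2 fold-change calculation. This is a sketch under stated assumptions rather than a disclosed procedure: it assumes positive, already normalized expression measurements with one or more replicates per animal group, and the function names and the two-fold default threshold are hypothetical.

```python
import math
from statistics import mean
from typing import Sequence


def log2_fold_change(test_levels: Sequence[float],
                     reference_levels: Sequence[float]) -> float:
    """log2 of (mean test expression / mean reference expression)."""
    test_mean = mean(test_levels)
    reference_mean = mean(reference_levels)
    if test_mean <= 0 or reference_mean <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(test_mean / reference_mean)


def expression_call(test_levels: Sequence[float],
                    reference_levels: Sequence[float],
                    min_abs_log2: float = 1.0) -> str:
    """Return 'increased', 'decreased' or 'unchanged' relative to reference.

    min_abs_log2 = 1.0 corresponds to a two-fold change; this cutoff is an
    assumption of the sketch, not a value taken from the specification.
    """
    lfc = log2_fold_change(test_levels, reference_levels)
    if lfc >= min_abs_log2:
        return "increased"
    if lfc <= -min_abs_log2:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical normalized signals for one polynucleotide.
    print(expression_call([8.0, 8.0], [2.0, 2.0]))
```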

[0081] A variety of assays well known in the art can be used to evaluate expression of particular polynucleotides or polypeptides associated with invertebrate foraging behavior including, for example, the invertebrate polynucleotides or polypeptides comprising SEQ ID NOS: 1-105. Assays that detect mRNA expression generally involve hybridization of a detectable agent, such as a complementary primer or probe, to the polynucleotide molecule. Such assays include, for example, RNA or dot blot analysis, primer extension, RNase protection assays, reverse-transcription PCR, competitive PCR, real-time quantitative PCR (TaqMan PCR), polynucleotide array analysis, and the like.

[0082] Additionally, constructs containing the promoter of a gene associated with foraging behavior in an invertebrate or with ADHD, hypertension, or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal can be functionally fused to a reporter gene (e.g. β-galactosidase, green fluorescent protein, luciferase) using known methods, and used to generate transgenic invertebrates. Such transgenic invertebrates can be used in the methods of the invention wherein expression of the reporter gene can be a marker for expression of a polynucleotide that modulates ADHD, hypertension or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal.

[0083] Those skilled in the art will appreciate that the methods of the invention can be practiced in the absence of knowledge of the sequence or function of the polynucleotides associated with ADHD, hypertension, or other disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal or a foraging behavior in an invertebrate. Expression of such polynucleotides can thus be evaluated using assays that examine overall patterns of polynucleotide expression characteristic of a foraging behavior. It will be understood that as these polynucleotides are identified or sequenced, specific probes, primers, antibodies and other binding agents can be made by methods well known in the art and used to evaluate their expression more specifically using any of the above detection methods.

[0084] One assay to examine patterns of expression of polynucleotides associated with a foraging behavior or polynucleotides that modulate the mammalian vestibular system, that does not require prior knowledge of their sequence, is mRNA differential display, which is described, for example, in Cirelli et al., Mol. Brain Res. 56:293 (1998) and Liang and Pardee, Mol. Biotech. 10:261-7 (1998). In such a method, RNA from the animal is reverse-transcribed and amplified by PCR using a particular combination of arbitrary primers. A detectable label, such as an enzyme, biotin, fluorescent dye or a radiolabel, is incorporated into the amplification products. The labeled products are then separated by size, such as on acrylamide gels, and detected by any method appropriate for detecting the label, including autoradiography, phosphorimaging or the like.

[0085] Such a method allows concurrent examination of expression of thousands of RNA species. Methods for determining which RNA species correspond to a polynucleotide associated with a foraging behavior are disclosed herein, for example, comparing polynucleotide expression levels in invertebrates that exhibit different foraging behavior. It can be readily determined whether a particular compound alters a pattern of polynucleotide expression, such as by increasing or decreasing the intensity of bands corresponding to polynucleotides associated with a foraging behavior.

[0086] A further assay to examine patterns of expression of polynucleotides is array analysis, in which polynucleotides representative of all or a portion of the genome of an invertebrate or mammal, or representative of all or a portion of expressed polynucleotides of an invertebrate or mammal, are attached to a solid support, such as a filter, glass slide, chip or culture plate. Detectably labeled probes, such as cDNA probes, are then prepared from mRNA of an animal, and hybridized to the array to generate a characteristic, reproducible pattern of expression associated with, for example, a foraging behavior. It can be readily determined whether a particular candidate compound alters this pattern of polynucleotide expression, by detecting an increase or decrease in the amount of probe hybridized at one or more locations on the array.

[0087] An expression profile used in the methods of the invention can be any read-out that provides a qualitative or quantitative indication of the expression or activity of a single polynucleotide or polypeptide, or of multiple polynucleotides or polypeptides. An expression profile can, for example, indicate the expression or activity of one, or of at least 2, at least 5, at least 10, at least 20, at least 50, at least 100, at least 265, or more polynucleotides or polypeptides. An expression profile can, for example, indicate the expression or activity in a mammal of mammalian homologs of one or more polynucleotides or polypeptides associated with invertebrate foraging behavior. An expression profile can also, for example, indicate the expression or activity in an invertebrate of one or more polynucleotides or polypeptides associated with invertebrate foraging behavior. An expression profile can indicate modulated expression or activity of one, a few, many, or all of the polypeptides or polynucleotides in the profile. Therefore, the methods of the invention can be used to identify modulated expression of 1 or more polynucleotides or polypeptides including, for example, 2 or more, 3 or more, 4 or more, 5 or more, 8 or more, 10 or more, 12 or more, 15 or more, 20 or more, 25 or more, or 50 or more polynucleotides or polypeptides.

[0088] The methods of the invention can be used to identify expression levels of any subset of polynucleotides or polypeptides desired to characterize a disease associated with a NO/cGMP-dependent protein kinase network. Such a subset of polynucleotides or polypeptides can be identified by the methods of the invention. In addition, a subset of polynucleotides or polypeptides that has been previously identified can be isolated and used in the methods of the invention, or the methods can be directed to detect only members of the subset within a larger population. A subset of polynucleotides or polypeptides can be chosen based on functional linkage of the polynucleotides or polypeptides including, for example, interaction in a signal transduction system or a metabolic system; physical linkage of the polynucleotides including, for example, proximity on a chromosome; or correlated co-expression.

[0089] An expression profile can also be a quantitative or qualitative measure of expression of polypeptides encoded by one or more polynucleotides. Such assays generally involve binding of a detectable agent, such as an antibody or selective binding agent, to the polypeptide in a sample of cells or tissue from the animal. Protein assays include, for example, immunohistochemistry, immunofluorescence, ELISA assays, immunoprecipitation, immunoblot or other protein-blot analysis, and the like. Additional methods include two-dimensional gel electrophoresis, MALDI-TOF mass spectrometry, and ProteinChip™/SELDI mass spectrometry technology.

[0090] An expression profile can also be a direct or indirect measure of the biological activity of polypeptides encoded by one or more polynucleotides. A direct measure of the biological activity of a polypeptide can be, for example, a measure of its enzymatic activity, using an assay indicative of such enzymatic activity. An indirect measure of the biological activity of a polypeptide can be its state of modification (e.g. phosphorylation, glycosylation, or proteolytic modification) or localization (e.g. nuclear or cytoplasmic), where the particular modification or localization is indicative of biological activity. A further indirect measure of the biological activity of a polypeptide can be the abundance of a substrate or metabolite of the polypeptide, such as a neurotransmitter, where the abundance of the substrate or metabolite is indicative of the biological activity of the polypeptide. Appropriate assays for measuring enzyme activity, polypeptide modifications, or amounts of substrates or products of an enzymatic polypeptide can be determined by one skilled in the art based on the biological activity of the particular polypeptide.

[0091] An appropriate method to use in determining an expression profile can be determined by those skilled in the art, and will depend, for example, on the number of polynucleotides being profiled; whether the method is performed in vivo or in a sample; the type of sample obtained; whether the assay is performed manually or is automated; the biological activity of the encoded polypeptide; the abundance of the transcript, protein, substrate or metabolite being detected; and the desired sensitivity, reproducibility and speed of the method.

[0092] An expression profile can be established in vivo, such as by diagnostic imaging procedures using detectably labeled antibodies or other binding molecules, or from a sample obtained from an individual. As changes in polynucleotide expression in the brain are likely to be most relevant to modulation of foraging behavior, appropriate samples can contain neural tissue, cells derived from neural tissues, or extracellular medium surrounding neural tissues, in which polypeptides to be detected or their metabolites are present. Thus, an appropriate sample for establishing an expression profile in humans can be, for example, cerebrospinal fluid, whereas in laboratory animals an appropriate sample can be, for example, a biopsy of the brain.

[0093] However, expression of polynucleotides can also be modulated in tissues other than neural tissue, and polypeptides or their metabolites can be secreted into bodily fluids. In particular, in the case of genetic disorders, an alteration in gene expression or function can be manifest in other cells in the body. Alternatively, a genetic disorder can be determined using any cell that contains genomic DNA, by detecting a mutation such as an insertion, deletion or modification of a gene associated with invertebrate foraging or a gene that modulates a mammalian vestibular system. An expression profile or presence of a genetic mutation can be determined from any convenient cell or fluid sample from the body, including blood, lymph, urine, breast milk, skin, hair follicles, cervix or cheek. Additionally, cells can readily be obtained using slightly more invasive procedures, such as punch biopsies of the breast or muscle, from the bone marrow or, during surgery, from essentially any organ or tissue of the body.

[0094] An expression profile can also be determined from cells in culture. These cells can be immortalized cells from a selected individual invertebrate or mammal, or can be cells from any known established invertebrate or mammalian cell line, such as those available from ATCC (Manassas, Va.). The expression profile of these cells can be measured, for example, in the absence and presence of a compound. A compound that modulates the expression of an invertebrate polynucleotide associated with foraging behavior or of a mammalian polynucleotide substantially the same as an invertebrate polynucleotide associated with foraging behavior can be a compound that modulates ADHD, hypertension, or other disease associated with a NO/cGMP-dependent protein kinase network.

[0095] Following identification of patterns of polynucleotide or polypeptide expression, those skilled in the art can determine the sequence and if desired clone the polynucleotide using standard molecular biology approaches. For example, a polynucleotide identified by differential display can be isolated and sequenced, or used to probe a library to identify the corresponding cDNA or genomic DNA. Likewise, a polynucleotide from an array can be identified based on its known position on the array and used to amplify or clone the corresponding cDNA or genomic DNA.

[0096] If desired, any of the expression and activity assays described above can be used in combination, either sequentially or simultaneously. Such assays can also be partially or completely automated, using methods known in the art.

[0097] Samples of the invertebrate collected for measuring polynucleotide expression levels can include any organ known or suspected of influencing foraging. Exemplary organs can be found in the head, neck, legs and antennae, and include, for example, the brain or nervous system. Samples can be collected from an invertebrate at various occasions, including before and/or after feeding, before and/or after administration of a compound, or before and/or after participating in a measurement of a foraging behavior. Typically, samples are collected under the same conditions as those under which foraging measurements are carried out, for example at about the same time of day, about the same amount of time after feeding, about the same environmental conditions, and the like. Samples can also be collected immediately following measurement of foraging behavior. For example, samples from a first and a second invertebrate can be collected immediately after subjecting the first and second invertebrates to conditions in which the first invertebrate exhibits a foraging behavior different than a foraging behavior exhibited by the second invertebrate. Typically, a time period considered immediately after an exhibited behavior is less than 5 minutes after measuring foraging behavior, but the time period can be any amount of time considered by one skilled in the art to be immediate relative to the period of time in which expression of a polynucleotide or polypeptide of interest can change. In this regard, a polynucleotide or polypeptide whose expression changes rapidly will require shorter times between observation of a foraging behavior and measurement of polynucleotide expression compared to a polynucleotide or polypeptide having a slower rate of change in expression level.

[0098] Evaluation of expression can involve sacrificing the animal at a selected time and homogenizing the entire animal, or a portion thereof, such as the brain or a neuronal tissue. One or more polynucleotide or polypeptide molecules can then be extracted therefrom. Alternatively, such assays can be performed with a polypeptide or polynucleotide extracted from a biopsied tissue of an invertebrate.

[0099] According to the methods of the invention polynucleotide or polypeptide expression levels in an invertebrate can be compared to polynucleotide or polypeptide expression levels in a reference invertebrate. A reference invertebrate used in the methods of the invention can be chosen based on a variety of factors that can influence foraging behavior including, for example, strain, genotype, age, gender, developmental stage, presence or absence of defined mutations or polymorphisms, exposure to a compound or lack of exposure to a compound, or having been subjected to a particular condition or set of conditions during a foraging assay. For example, an invertebrate exposed to a compound can be compared to a reference invertebrate that has not been exposed to the compound. Alternatively, a reference invertebrate can be exposed to the compound to which the test invertebrate was exposed.

[0100] It is possible that an invertebrate used in the methods of the invention can exhibit substantially the same foraging behavior as a reference invertebrate before a compound is administered. It is also possible that an invertebrate used in the methods of the invention can exhibit a different foraging behavior from a reference invertebrate before a compound is administered. Following administration of a compound, an invertebrate may display a foraging behavior that is substantially the same or different from a foraging behavior displayed by a reference invertebrate that has been either exposed to the same compound, to another compound or not exposed to the compound. Additionally, an invertebrate can be its own reference and polynucleotide or polypeptide expression can be measured at different times or under different conditions.

[0101] One skilled in the art can choose a reference invertebrate for use in the methods of the invention according to factors suspected or known to affect the desired comparison. In cases where differences in conditions or effect of a compound are to be determined, it can be advantageous to use a reference invertebrate that is similar to the invertebrate being tested. In this regard, differential expression can be determined for invertebrates of the same strain that have been subjected to conditions in which a first group of members of a strain exhibit a foraging behavior different from the foraging behavior of a second group of members of the strain. Such conditions can be, for example, administration of a compound, presence or absence of food, feeding or starvation, and the like. Differential expression is then determined by measuring expression levels in the two groups and identifying polynucleotides expressed at significantly different levels.

[0102] Polynucleotides or polypeptides that are expressed at significantly different levels can be termed differentially expressed. Significantly different levels are levels that vary from each other by a diagnostic amount. A diagnostic amount can be, for example, an amount that is greater than the variability of expression levels between invertebrates that ideally would have identical expression levels (i.e., having identical genetic makeup, age, gender, raised under identical conditions, and the like). In such a situation, a significantly different level can be a difference that is greater than the mean difference observed between expression levels, or greater than the largest expression level difference observed between most or all polynucleotides in the ideally identical organisms. Alternatively, significantly different levels can be based on the composite variability of polynucleotide expression levels between two or more strains. For example, the mean or median difference between polynucleotide expression levels can be determined between a large number of different strains. A polynucleotide whose difference in expression is greater than the mean or median difference can be considered differentially expressed. Significantly different levels of expression can be identified as a minimum percent change including, for example, at least a 100% change, at least a 75% change, at least a 50% change, at least a 25% change or at least a 10% change in expression level or lower. Significantly different levels of expression that are even larger can be observed in the methods of the invention and can be identified as a minimum fold change, including for example, at least a 2 fold change, at least a 3 fold change, at least a 4 fold change, at least a 5 fold change, at least a 10 fold change in expression level or higher. Other reference levels defining a significant difference can be determined by one of skill in the art according to the desired comparison between two or more invertebrates.
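The percent-change and fold-change criteria described above can be illustrated with a minimal sketch. The gene names, expression values and the helper names `percent_change`, `fold_change` and `differentially_expressed` are hypothetical and serve only to show how a minimum fold-change threshold might be applied; they are not part of the disclosed method.

```python
def percent_change(reference: float, test: float) -> float:
    """Percent change of the test level relative to the reference level."""
    if reference <= 0:
        raise ValueError("reference expression level must be positive")
    return (test - reference) / reference * 100.0


def fold_change(reference: float, test: float) -> float:
    """Symmetric fold change: always >= 1, whatever the direction of change."""
    if reference <= 0 or test <= 0:
        raise ValueError("expression levels must be positive")
    return max(test / reference, reference / test)


def differentially_expressed(ref_levels, test_levels, min_fold=2.0):
    """Map each name meeting the fold-change threshold to (direction, fold)."""
    hits = {}
    for name, ref in ref_levels.items():
        test = test_levels[name]
        fold = fold_change(ref, test)
        if fold >= min_fold:
            hits[name] = ("up" if test > ref else "down", fold)
    return hits


# Hypothetical normalized expression levels for two groups of invertebrates
# that exhibit different foraging behaviors.
sitter = {"gene_a": 10.0, "gene_b": 8.0, "gene_c": 5.0}
rover = {"gene_a": 25.0, "gene_b": 9.0, "gene_c": 1.0}

hits = differentially_expressed(sitter, rover, min_fold=2.0)
for name, (direction, fold) in sorted(hits.items()):
    print(f"{name}: {direction}, {fold:.1f} fold")
```

In this sketch a 2 fold increase corresponds to a 100% change; a lower threshold such as a 25% change could be applied in the same way using `percent_change`.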

[0103] A gene that is differentially expressed in two invertebrate groups that exhibit different foraging behaviors can be considered a gene associated with invertebrate foraging behavior. It is understood that a polynucleotide associated with invertebrate foraging behavior is a polynucleotide whose expression is correlated with modulation of invertebrate foraging behavior. For example, a polynucleotide associated with foraging behavior can be a polynucleotide identified as more highly expressed in a rover than in a sitter. The sequence and function of such an associated polynucleotide can be previously known or unknown. Exemplary invertebrate polynucleotides associated with invertebrate foraging behavior are foraging/dg2 (SEQ ID NO: 47); alcohol dehydrogenase (SEQ ID NO: 75); inositol polyphosphate 1-phosphatase (SEQ ID NO: 48); inositol 1,4,5-tris-phosphate receptor (SEQ ID NO: 49); Dead Box-1 (SEQ ID NO: 50); CNS-specific protein Noe (SEQ ID NO: 51); cellular repressor of E1A-stimulated genes (SEQ ID NO: 77); 14-3-3 ε (SEQ ID NO: 52); casein kinase II α subunit (SEQ ID NO: 53); mRNA sequence similar to syntaxin I (SEQ ID NO: 54); ADP/ATP translocase/sesB (SEQ ID NO: 55); mitochondrial porin (SEQ ID NO: 56); neuron specific zinc finger transcription factor (scratch) (SEQ ID NO: 57); ecdysone-regulated (E93) (SEQ ID NO: 58); centrosomal and chromosomal factor (ccf) (SEQ ID NO: 59); activin β precursor (SEQ ID NO: 79); dynamin-like (SEQ ID NO: 81); paramyosin; mitochondrial ATP synthase α subunit; Fas-associated factor (FFAF) (SEQ ID NO: 87); lamin precursor (SEQ ID NO: 89); 18S, 5.8S, 2S and 28S rRNA genes (SEQ ID NO: 60); and ribosomal protein S6 gene (SEQ ID NO: 61). Additional exemplary polynucleotides associated with invertebrate foraging behavior are polynucleotides that contain a polynucleotide sequence selected from SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 or SEQ ID NO: 104. Such a polynucleotide associated with invertebrate foraging behavior can be substantially the same as at least one mammalian polynucleotide that modulates ADHD, hypertension or other disease associated with a NO/cGMP-dependent protein kinase network. Therefore, polynucleotides selected from SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104 can be substantially the same as polynucleotides that modulate ADHD, hypertension or other disease associated with a NO/cGMP-dependent protein kinase network.

[0104] The invention provides a method of identifying a polynucleotide that correlates with a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal. The method consists of (a) obtaining a first and a second strain of an invertebrate; (b) subjecting the first and second invertebrate strains to conditions in which the first strain exhibits a foraging behavior different than a foraging behavior exhibited by the second strain; (c) measuring an expression level for one or more polynucleotides in the first and second strains, and (d) identifying one or more polynucleotides that are differentially expressed in the first strain relative to the second strain, wherein a mammalian polynucleotide comprising substantially the same nucleic acid sequence as the one or more differentially expressed polynucleotides correlates with a disease associated with a nitric oxide/cGMP-dependent protein kinase network in a mammal.

[0105] Therefore, the invention provides a method of identifying a polynucleotide that correlates with ADHD in a mammal. The method consists of (a) obtaining a first and a second strain of an invertebrate; (b) subjecting the first and second invertebrate strains to conditions in which the first strain exhibits a foraging behavior different than a foraging behavior exhibited by the second strain; (c) measuring an expression level for one or more polynucleotides in the first and second strains, and (d) identifying one or more polynucleotides that are differentially expressed in the first strain relative to the second strain, wherein a mammalian polynucleotide comprising substantially the same nucleic acid sequence as the one or more differentially expressed polynucleotides correlates with ADHD in a mammal.

[0106] The invention also provides a method of identifying a polynucleotide that correlates with hypertension. The method consists of (a) obtaining a first and a second strain of an invertebrate; (b) subjecting the first and second invertebrate strains to conditions in which the first strain exhibits a foraging behavior different than a foraging behavior exhibited by the second strain; (c) measuring an expression level for one or more polynucleotides in the first and second strains, and (d) identifying one or more polynucleotides that are differentially expressed in the first strain relative to the second strain, wherein a mammalian polynucleotide comprising substantially the same nucleic acid sequence as the one or more differentially expressed polynucleotides correlates with hypertension in a mammal.

[0107] Conditions in which a first invertebrate strain can exhibit a foraging behavior different than a foraging behavior in a second invertebrate strain include, for example, environmental or biorhythmic factors that, when imposed on two different invertebrate strains, result in the two strains exhibiting dissimilar foraging behavior. Environmental factors include, for example, the physical environment such as temperature, pressure, light intensity, light quality, time of day, presence of food and the like; components of the gaseous environment such as humidity, % oxygen, presence of a compound such as a drug or hormone, and the like; and the structural makeup of the chamber in which the invertebrate is housed, including volume, particularly as it influences density of invertebrates, shape, composition of the chamber, and the like. Biological factors that can influence invertebrate foraging can include, for example, genetic factors such as presence of a particular allele, mutations that are either naturally occurring or induced in the laboratory; biorhythmic factors such as time of day, relative activity level of an invertebrate, length of time an invertebrate has been active, and the like; and biochemical factors such as developmental and hormonal state of an invertebrate, fasting state of the invertebrate, presence in the invertebrate of an administered compound, and the like. Further included are factors such as gender and age of the invertebrate.

[0108] Typically, foraging behavior experiments are carried out on adult invertebrates during the daytime, at least about two hours after sunrise and at least about two hours before sunset, and at least two hours after invertebrates have been at a relatively increased level of activity. Typical environmental conditions are about 22° C., 1 atmosphere, at ambient humidity and in a darkened room. An exemplary chamber is described in Example I and shown in FIG. 1.

[0109] Determination of a foraging behavior of a first invertebrate that is different when compared to a foraging behavior of a second invertebrate can be accomplished by analysis of foraging behavior for the two invertebrates. For example, the mean foraging measurement, typically termed the foraging score, of a first invertebrate strain is considered different from the foraging score of a second invertebrate strain if a pairwise t-test of the two scores is significant at the 0.05 level, or if multiple pairwise comparisons between strains remain significant after applying a correction for experiment-wise error. A significantly different score refers to a score that is different by a statistically meaningful amount. In cases where foraging scores for populations of invertebrates are compared, two foraging scores are considered different if a first mean foraging score is not within a desired region of the probability distribution of the second foraging score. For example, a first mean foraging score can be different if it is not within the 80% probable region of a probability distribution of the second foraging score, or within the 85%, 90%, 95% or 98% probable region of the distribution of the second foraging score. Correspondingly, foraging scores considered to be substantially the same are foraging scores that do not differ by more than a desired standard deviation or are within a desired probable region of a probability distribution. Methods for the determination of mean, standard deviation and characteristics of normal distributions are known in the art as demonstrated by texts such as Biostatistical Analysis 4th ed., Zar, Prentice-Hall Inc. (1999).
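The probable-region comparison described above can be sketched as follows. This sketch assumes that the scores of the second population are approximately normally distributed and that the probable region is the central region of a normal distribution fitted to those scores; it does not perform the t-test itself. The helper names, the foraging scores, and the choice of a Bonferroni correction as one possible correction for experiment-wise error are all illustrative assumptions.

```python
from statistics import NormalDist, mean, stdev


def probable_region(scores, coverage=0.95):
    """Central region of a normal distribution fitted to the scores."""
    dist = NormalDist(mean(scores), stdev(scores))
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def scores_differ(first_scores, second_scores, coverage=0.95):
    """True if the first mean score lies outside the probable region
    of the distribution fitted to the second set of scores."""
    low, high = probable_region(second_scores, coverage)
    return not (low <= mean(first_scores) <= high)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction
    (an assumed example of a correction for experiment-wise error)."""
    return alpha / n_comparisons


# Hypothetical foraging scores (for example, path lengths) for two strains.
second_strain = [4.0, 5.0, 6.0, 5.0, 4.5, 5.5]
first_strain = [9.0, 10.0, 11.0, 9.5, 10.5]

print(scores_differ(first_strain, second_strain, coverage=0.95))
print(bonferroni_alpha(0.05, 5))
```

Changing `coverage` to 0.80, 0.85, 0.90 or 0.98 applies the other probable regions mentioned above.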

[0110] A mammalian polynucleotide or polypeptide that is associated with ADHD, hypertension or a disease associated with a NO/cGMP-dependent protein kinase network in a mammal can be identified by sequence homology with a polypeptide or polynucleotide of an invertebrate that is modulated in association with altered foraging behavior. A mammalian sequence that is substantially the same as an invertebrate sequence can be identified as a mammalian nucleic acid or corresponding amino acid sequence that has a high degree of homology to an invertebrate nucleic acid or corresponding amino acid sequence and at least one similar function.

[0111] According to the methods of the invention increased expression of a mammalian polynucleotide can correlate with an increase or decrease in severity of ADHD or hypertension. In addition decreased expression of a mammalian polynucleotide can correlate with an increase or decrease in severity of ADHD or hypertension. Thus, according to the methods of the invention, one or more polynucleotides that are differentially expressed in a first invertebrate strain relative to a second invertebrate strain can have increased expression and a mammalian polynucleotide having substantially the same sequence can have increased expression when involved in ADHD or hypertension in a mammal. Alternatively, a mammalian polynucleotide having substantially the same sequence as an invertebrate polynucleotide that demonstrates increased expression can have decreased expression when involved in ADHD or hypertension in a mammal. Additionally, one or more polynucleotides that are differentially expressed in a first invertebrate strain relative to a second invertebrate strain can have decreased expression and a mammalian polynucleotide having substantially the same sequence can have decreased expression when involved in ADHD or hypertension in a mammal. Alternatively, a mammalian polynucleotide having substantially the same sequence as an invertebrate polynucleotide that demonstrates decreased expression can have increased expression when involved in ADHD or hypertension in a mammal.

[0112] Polynucleotides that have substantially the same sequence can be identified, for example, by hybridization techniques where the polynucleotide molecules selectively hybridize via complementary base pairing under moderately stringent conditions or under highly stringent conditions. Stringency depends on a variety of factors including, for example, temperature, concentration of probe and/or target polynucleotide, ionic strength and pH. As known to those of skill in the art, the stability of hybrids is reflected in the melting temperature (T_m) of the hybrids. Typically, the hybridization reaction is performed under conditions of lower stringency, followed by washes of varying, but higher, stringency.
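The dependence of hybrid stability on ionic strength and denaturant can be illustrated with one commonly cited empirical approximation for the melting temperature of long DNA:DNA hybrids. The formula and its coefficients are an assumption introduced here for illustration and are not taken from this specification; published versions differ slightly in their coefficients, and the function name and parameters are hypothetical.

```python
import math


def melting_temperature(na_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Approximate T_m in degrees C for a long DNA:DNA hybrid.

    Assumed empirical form:
    T_m = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


# Hypothetical 500 bp hybrid with 50% GC content under two salt conditions.
high_salt = melting_temperature(na_molar=1.0, gc_percent=50.0, length_bp=500)
low_salt = melting_temperature(na_molar=0.018, gc_percent=50.0, length_bp=500)
with_formamide = melting_temperature(1.0, 50.0, 500, formamide_percent=50.0)

print(f"high salt: {high_salt:.1f} C")
print(f"low salt: {low_salt:.1f} C")
print(f"high salt, 50% formamide: {with_formamide:.1f} C")
```

Under this approximation, lowering the salt concentration or adding formamide lowers the estimated T_m, which is why low-salt washes at elevated temperature are more stringent.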

[0113] Moderately stringent hybridization refers to conditions that permit a target polynucleotide to bind a complementary polynucleotide that has about 60% identity, preferably about 75% identity, more preferably about 85% identity to the target polynucleotide; with greater than about 90% identity to the target polynucleotide being especially preferred. Preferably, moderately stringent conditions are conditions equivalent to hybridization in 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42° C., followed by washing in 0.2× SSPE, 0.2% SDS, at 65° C.

[0114] High stringency hybridization refers to conditions that permit hybridization of only those polynucleotide sequences that form stable hybrids in 0.018M NaCl at 65° C. (i.e., if a hybrid is not stable in 0.018M NaCl at 65° C., it will not be stable under high stringency conditions, as contemplated herein). High stringency conditions can be provided, for example, by hybridization in 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42° C., followed by washing in 0.1× SSPE and 0.1% SDS at 65° C.

[0115] Low stringency hybridization refers to conditions equivalent to hybridization in 10% formamide, 5× Denhardt's solution, 6× SSPE, 0.2% SDS at 42° C., followed by washing in 1× SSPE, 0.2% SDS, at 50° C. Denhardt's solution and SSPE (see, e.g., Sambrook et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, 1989) are well known to those of skill in the art as are other suitable hybridization buffers.

[0116] Polypeptides or polynucleotides encoding polypeptides that are substantially the same can also be identified as those having minor sequence modifications with respect to each other, so long as the polypeptides have similar functional activities. Two or more polynucleotides or the polypeptides encoded therefrom can have a variety of similar activities including, for example, immunogenicity, antigenicity, enzymatic activity, binding activity, or other biological property, including invertebrate foraging behavior-modulating activity.

[0117] A modification of a polynucleotide molecule can also include substitutions that do not change the encoded amino acid sequence due to the degeneracy of the genetic code. Such modifications can correspond to variations that are made deliberately, or which occur as mutations during polynucleotide replication. Additionally, a modification of a polynucleotide molecule can correspond to a splice variant form of the recited sequence.

[0118] Additionally, a fragment of a mammalian polynucleotide can be substantially the same as an invertebrate polynucleotide or a fragment of an invertebrate polynucleotide. A mammalian polynucleotide can be substantially the same as an invertebrate polynucleotide, for example, when one of several domains encoded by a mammalian polynucleotide corresponds to a domain encoded by an invertebrate protein. Such a fragment typically is encoded by at least 30 nucleotides, and the mammalian and invertebrate polynucleotides encoding that fragment share at least about 50% identity, at least about 60% identity, at least about 70% identity, at least about 80% identity, at least about 90% identity, at least about 95% identity or at least about 98% identity. Methods for determining that a fragment of a mammalian polynucleotide is substantially the same as an invertebrate polynucleotide or a fragment of an invertebrate polynucleotide include those described above for comparing mammalian and invertebrate polynucleotides. Such a fragment can be encoded by 30 or more nucleotides, for example, 45 or more nucleotides, 60 or more nucleotides, 90 or more nucleotides, 150 or more nucleotides, 210 or more nucleotides, or 300 or more nucleotides.
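The fragment criteria above (a minimum length together with a minimum percent identity) can be sketched as follows. The sketch assumes the two fragments have already been aligned without gaps and are of equal length; the sequences and the helper names `percent_identity` and `substantially_same_fragment` are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def substantially_same_fragment(seq_a, seq_b, min_length=30, min_identity=50.0):
    """Apply a minimum length and a minimum percent identity to a fragment pair."""
    return len(seq_a) >= min_length and percent_identity(seq_a, seq_b) >= min_identity


# Hypothetical 30-nucleotide fragments that differ at three positions.
invertebrate = "ATGGCGTACG" "TTAGCCTAGG" "ATCCGATTAC"
mammalian = "CTGGCGTACG" "ATAGCCTAGG" "GTCCGATTAC"

print(percent_identity(invertebrate, mammalian))
print(substantially_same_fragment(invertebrate, mammalian, min_identity=80.0))
print(substantially_same_fragment(invertebrate, mammalian, min_identity=95.0))
```

The `min_length` and `min_identity` parameters can be set to any of the values recited above, for example 45 nucleotides and 90% identity.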

[0119] Biological functions retained by a fragment can include the ability to modulate ADHD, hypertension or a disease associated with a NO/cGMP-dependent kinase network in a mammal, the ability to modulate invertebrate foraging, the ability to bind an antibody that binds to a full-length protein from which the fragment was derived, or an enzymatic or binding activity characteristic of the full length protein. For example, peptides corresponding to 30 amino acid domains of CaMKII and PKC inhibit the respective full length enzymes as described in Kane et al., Neuron 18:307-314 (1997), Broughton et al., J. Cell. Biochem. 62:484-494 (1996), Broughton et al., J. Cell. Biochem. 60:584-600 (1996) and Griffith et al., Neuron 10:501-509 (1993).

[0120] Methods for determining that two sequences are substantially the same are well known in the art. For example, one method for determining if two sequences are substantially the same is BLAST, Basic Local Alignment Search Tool, which can be used according to default parameters as described by Tatusova et al., FEMS Microbiol. Lett. 174:247-250 (1999) or on the National Center for Biotechnology Information web page at ncbi.nlm.nih.gov/BLAST/. BLAST is a set of similarity search programs designed to examine all available sequence databases and can function to search for similarities in amino acid or nucleic acid sequences. A BLAST search provides search scores that have a well-defined statistical interpretation. Furthermore, BLAST uses a heuristic algorithm that seeks local alignments and is therefore able to detect relationships among sequences which share only isolated regions of similarity including, for example, protein domains (Altschul et al., J. Mol. Biol. 215:403-410 (1990)).

[0121] In addition to the originally described BLAST (Altschul et al., supra, 1990), modifications to the algorithm have been made (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)). One modification is Gapped BLAST, which allows gaps, either insertions or deletions, to be introduced into alignments. Allowing gaps in alignments tends to reflect biologic relationships more closely. For example, gapped BLAST can be used to identify sequence identity within similar domains of two or more proteins. A second modification is PSI-BLAST, which is a sensitive way to search for sequence homologs. PSI-BLAST performs an initial Gapped BLAST search and uses information from any significant alignments to construct a position-specific score matrix, which replaces the query sequence for the next round of database searching. A PSI-BLAST search is often more sensitive to weak but biologically relevant sequence similarities.
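The notion of a gapped local alignment can be illustrated with a short sketch. The code below is not BLAST, which is a heuristic; it is an exhaustive Smith-Waterman style dynamic program with a simple linear gap penalty, and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen only for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b, allowing gaps."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A score never drops below zero, so an alignment can start anywhere.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


# Hypothetical sequences: the first carries one extra residue relative to the second.
with_insertion = "ACGTTACG"
without_insertion = "ACGTACG"
gapped_score = local_alignment_score(with_insertion, without_insertion)
```

Allowing a single gap lets the two flanking regions be joined into one alignment, which scores higher than either ungapped region alone.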

[0122] A second resource that can be used to determine if two sequences are substantially the same is PROSITE, available on the world wide web at ExPASy. PROSITE is a method of determining the function of uncharacterized proteins translated from genomic or cDNA sequences (Bairoch et al., Nucleic Acids Res. 25:217-221 (1997)). PROSITE consists of a database of biologically significant sites and patterns that can be used to identify to which known family of proteins, if any, a new sequence belongs. In some cases, the sequence of an unknown protein is too distantly related to any protein of known structure to detect similarity by overall sequence alignment. However, a protein that is substantially the same as another protein can be identified by the occurrence in its sequence of a particular cluster of amino acid residues, which can be called a pattern, motif, signature or fingerprint, that is substantially the same as a particular cluster of amino acid residues in the other protein including, for example, those found in similar domains. PROSITE uses a computer algorithm to search for motifs that identify proteins as family members. PROSITE also maintains a compilation of previously identified motifs, which can be used to determine if a newly identified protein is a member of a known protein family.
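A motif search of the kind described above can be sketched by converting a PROSITE-style pattern into a regular expression. The converter below is a simplified, hypothetical helper that handles only the basic pattern elements (single residues, `x`, bracketed and braced sets, repeat counts, and terminal anchors); it is not the PROSITE scanning software. The example pattern `N-{P}-[ST]-{P}` is the N-glycosylation site pattern, and the peptide sequences are invented for illustration.

```python
import re

_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Convert a basic PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        parsed = _ELEMENT.fullmatch(element)
        if parsed is None:
            raise ValueError("unsupported pattern element: " + element)
        core, low, high = parsed.groups()
        if core == "x":                      # any residue
            core = "."
        elif core.startswith("{"):           # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


motif = re.compile(prosite_to_regex("N-{P}-[ST]-{P}"))
hit = motif.search("MKNGSAL")      # hypothetical peptide containing the motif
miss = motif.search("MKNPSAL")     # proline after N violates the pattern
```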

[0123] Therefore, the invention further provides a method of identifying a polypeptide involved in a disease associated with a NO/cGMP-dependent kinase network in a mammal. The method consists of (a) obtaining a first and a second strain of an invertebrate; (b) subjecting the first and second invertebrate strains to conditions in which the first strain exhibits a foraging behavior different than a foraging behavior exhibited by the second strain; (c) measuring polynucleotide expression levels in the first and second strains; (d) determining the amino acid sequence of a polypeptide encoded by one or more polynucleotides; and (e) identifying one or more polypeptides encoded by the one or more polynucleotides that are differentially expressed in the first strain relative to the second strain, whereby a mammalian polypeptide having substantially the same amino acid sequence as the one or more differentially expressed polypeptides is involved in a disease associated with a NO/cGMP-dependent kinase network in a mammal.

[0124] Therefore, the invention provides a method of identifying a polypeptide involved in ADHD or hypertension in a mammal. The method consists of (a) obtaining a first and a second strain of an invertebrate; (b) subjecting the first and second invertebrate strains to conditions in which the first strain exhibits a foraging behavior different than a foraging behavior exhibited by the second strain; (c) measuring polynucleotide expression levels in the first and second strains; (d) determining the amino acid sequence of a polypeptide encoded by one or more polynucleotides; and (e) identifying one or more polypeptides encoded by the one or more polynucleotides that are differentially expressed in the first strain relative to the second strain, whereby a mammalian polypeptide having substantially the same amino acid sequence as the one or more differentially expressed polypeptides is involved in ADHD or hypertension in a mammal.
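Step (e) of the methods above, identifying polynucleotides that are differentially expressed in the first strain relative to the second strain, can be sketched as a simple log2 fold-change comparison. The gene names, expression values, pseudocount and two-fold threshold below are all hypothetical assumptions; the specification does not prescribe a particular statistic.

```python
import math


def differentially_expressed(first, second, min_abs_log2=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes passing the threshold.

    first and second map gene names to expression levels measured in the
    first and second strains. Only genes measured in both are compared.
    """
    result = {}
    for gene in first.keys() & second.keys():
        log2_fc = math.log2((first[gene] + pseudocount) / (second[gene] + pseudocount))
        if abs(log2_fc) >= min_abs_log2:
            result[gene] = log2_fc
    return result


# Hypothetical expression levels in two strains with different foraging behavior.
rover = {"geneA": 79.0, "geneB": 19.0, "geneC": 50.0}
sitter = {"geneA": 19.0, "geneB": 79.0, "geneC": 50.0}
changed = differentially_expressed(rover, sitter)
```

A positive value indicates higher expression in the first strain and a negative value indicates lower expression; the encoded polypeptides of the retained genes would then be compared against mammalian sequences as described.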

[0125] The invention provides an isolated polynucleotide, or fragment thereof, having ADHD-altering activity in a mammal and having substantially the same nucleic acid sequence as a polynucleotide selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104. The invention also provides an isolated polynucleotide, or fragment thereof, having hypertension-altering activity in a mammal and having substantially the same nucleic acid sequence as a polynucleotide selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.

[0126] Genetic methods of identifying new genes associated with invertebrate foraging behavior that are applicable to a variety of invertebrates are known in the art. For example, the invertebrate can be mutagenized using chemicals, radiation or insertions (e.g. transposons, such as P element mutagenesis), appropriate crosses performed, and the progeny screened for phenotypic differences in foraging behavior compared with normal controls. The gene can then be identified by a variety of methods including, for example, linkage analysis or rescue of the gene targeted by the inserted element. Genetic methods of identifying genes are described for Drosophila, for example, in Greenspan, Fly Pushing: The Theory and Practice of Drosophila Genetics, Cold Spring Harbor Laboratory Press (1997).

[0127] There are numerous important diagnostic, therapeutic, and screening applications that arise from identification of novel genes that modulate ADHD, hypertension or other diseases associated with a NO/cGMP-dependent kinase network in a mammal. For example, an expression or activity profile of one or many genes that modulate a disease can be established as a molecular fingerprint of the presence or severity of the disease. Thus, in diagnostic applications, it can readily be determined, by comparing the expression profile of an individual to one or more reference profiles, whether that individual suffers from, or is susceptible to, a particular disease. Likewise, the sensitivity of a NO/cGMP-dependent kinase network in a mammal and the effect of medications or medical procedures on the network in a mammal, can be determined at the molecular level. Such determinations allow for more appropriate determination and use of therapeutics for treating disorders such as ADHD or hypertension.
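The comparison of an individual's expression profile to one or more reference profiles can be sketched as a nearest-reference search. The use of Pearson correlation as the similarity measure, the profile values and the reference labels are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def closest_reference(profile, references):
    """Name of the reference profile most correlated with the given profile."""
    return max(references, key=lambda name: pearson(profile, references[name]))


# Hypothetical reference profiles over the same four genes, in the same order.
references = {
    "unaffected": [1.0, 2.0, 8.0, 9.0],
    "affected": [9.0, 8.0, 2.0, 1.0],
}
individual = [8.0, 9.0, 1.0, 3.0]
label = closest_reference(individual, references)
```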

[0128] In screening applications, identification of genes that modulate ADHD, hypertension or other diseases associated with a NO/cGMP-dependent kinase network in a mammal allows novel compounds to be identified, lead compounds to be validated, and the molecular effects of these compounds and other known compounds to be characterized, by determining the effect of these compounds on an expression profile. For example, the ability of a compound, administered to an individual with a particular disorder, to alter the expression profile to correspond more closely to the profile of an unaffected or normal individual can be determined as described herein.

[0129] An isolated polynucleotide molecule of the invention that modulates ADHD, hypertension or other diseases associated with a NO/cGMP-dependent kinase network in a mammal can contain a sequence that is substantially the same as a sequence from a polynucleotide that is differentially expressed in invertebrates having different foraging behaviors. As described in Example II, SEQ ID NOS: 50-56 and SEQ ID NO: 77 correspond to polynucleotides having increased expression in Rovers as compared to sitters. In addition, SEQ ID NO: 77, SEQ ID NO: 79, and SEQ ID NO: 85 correspond to polynucleotides encoding polypeptides that are expressed at higher levels in Rovers as compared to sitters. Polypeptides that are expressed at higher levels in Rovers as compared to sitters include, for example, SEQ ID NO: 78, SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 83 and SEQ ID NO: 86. As further described in Example II, SEQ ID NOS: 57-59 correspond to polynucleotides having decreased expression in Rovers as compared to sitters. In addition, SEQ ID NO: 75, SEQ ID NO: 87 and SEQ ID NO: 89 correspond to polynucleotides encoding polypeptides that are expressed at lower levels in Rovers as compared to sitters. Polypeptides that are expressed at lower levels in Rovers as compared to sitters include, for example, SEQ ID NO: 76, SEQ ID NO: 88 and SEQ ID NO: 90.

[0130] In accordance with the present invention, various polynucleotides identified by the methods of the invention are homologous to known mammalian polynucleotides including, for example, SEQ ID NOS: 47-61, SEQ ID NO: 75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, and SEQ ID NO: 89, which are homologous to SEQ ID NOS: 62-74, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102, and SEQ ID NO: 104, respectively, as shown in Table 1. These polynucleotides and the polypeptides they encode are known in the art. One skilled in the art will be able to make and use these polynucleotides and their respective polypeptide products in the methods of the invention according to their known properties as described, for example, in the GenBank or FlyBase databases. Table 1 provides database accession numbers for these sequences.

TABLE 1

Gene | Fly homolog SEQ ID NO | Fly homolog GenBank | FlyBase | Human homolog SEQ ID NO | Human homolog GenBank
foraging/dg2 | 47 | M30413 | - | 62 | Z92869
alcohol dehydrogenase | 75 | M17837 | FBgn0000055 | 91 | NM_000667
inositol polyphosphate 1-phosphatase | 48 | AF069513 | FBgn0016672 | 63 | NM_002194
inositol 1,4,5-tri-phosphate receptor | 49 | Z18535 | FBgn0010051 | 64 | L38019
Dead Box-1 | 50 | AF057167 | FBgn0015075 | 65 | NM_004939
CNS-specific protein Noe | 51 | AF077736 | FBgn0026197 | 66 | HS321I20
cellular repressor of E1A-stimulated genes | 77 | AF084522 | FBgn0025456 | 93 | NM_003851
14-3-3 ε | 52 | U84898 | FBgn0020238 | 67 | U54778
casein kinase II α subunit | 53 | M16534 | FBgn0000258 | 68 | J02853
mRNA sequence similar to syntaxin I | 54 | AF007164 | - | 69 | U12918
ADP/ATP translocase/sesB | 55 | AI405031 | FBgn0003360 | 70 | NM_001151
mitochondrial porin | 56 | AJ000880 | FBgn0004363 | 71 | L06132
neuron specific zinc finger transcription factor (scratch) | 57 | U36477 | FBgn0004880 | 72 | NM_003425
ecdysone-regulated (E93) | 58 | U25686 | FBgn0013948 | - | -
centrosomal and chromosomal factor (ccf) | 59 | AI544153 | - | - | -
activin β precursor | 79 | AF054822 | FBgn0024913 | 95 | A14422
dynamic-like | - | S17974 | - | - | A40671
paramyosin | - | P35416 | - | - | P11055
mitochondrial ATP synthase α subunit | 85 | Y07894 | - | 100 | NM_004046
Fas-associated factor (FFAF) | 87 | AB013610 | FBgn0025608 | 102 | KIAA0942
lamin precursor | 89 | X07278 | FBgn0002525 | 104 | NM_005572
18S, 5.8S, 2S and 28S rRNA genes | 60 | 21017 | FBgn0016760 | 73 | U13369
ribosomal protein S6 gene | 61 | DRORBS6 | FBgn0004922 | 74 | NM_001010

[0131] The isolated polynucleotide molecules comprising SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 or SEQ ID NO: 104 can hybridize to mammalian polynucleotides, and thus can be used in the diagnostic and screening methods described herein. Additionally, the isolated polynucleotide molecules containing sequences substantially the same as one of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 or SEQ ID NO: 104 can be administered in gene therapy methods including, for example, antisense methods, to decrease expression of polypeptides that modulate ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. The isolated polynucleotide molecules of the invention can also be used as probes or primers to identify larger cDNAs or genomic DNA, or to identify homologs of the polynucleotide molecules in other species. The isolated polynucleotide molecules can further be expressed to produce polypeptides for use in producing antibodies or for designing or identifying inhibitory or stimulatory compounds. It is understood that the isolated polynucleotide molecules of the invention can be used for a variety of other uses known to those skilled in the art.

[0132] The invention also provides isolated polynucleotides containing at least 15 contiguous nucleotides of a nucleotide sequence referenced as SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, or an antisense strand thereof. A polynucleotide of the invention can include, for example, at least 15 contiguous nucleotides from the reference nucleotide sequence, can include at least 16, 17, 18, 19, 20 or at least 25 contiguous nucleotides, and often includes at least 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200 or more contiguous nucleotides from the reference nucleotide sequence. The isolated polynucleotides of the invention are able to specifically hybridize to polynucleotide molecules associated with invertebrate foraging or with modulation of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal, under moderately or highly stringent hybridization conditions and thus can be advantageously used, for example, as probes in a diagnostic assay; as sequencing or PCR primers; as antisense reagents to administer to an individual to block gene expression; or in other applications known to those skilled in the art in which hybridization to a polynucleotide molecule is desirable.
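Whether a candidate polynucleotide contains at least 15 contiguous nucleotides of a reference sequence, or of its antisense strand, can be checked with a short sketch such as the following. The reference sequence shown is a hypothetical stand-in and is not one of the recited SEQ ID NOS; the function names are likewise illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Antisense strand of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def contains_contiguous(candidate, reference, k=15):
    """True if candidate shares k contiguous nucleotides with either strand."""
    candidate = candidate.upper()
    strands = (reference.upper(), reverse_complement(reference))
    for start in range(len(candidate) - k + 1):
        window = candidate[start:start + k]
        if any(window in strand for strand in strands):
            return True
    return False


# Hypothetical 27-nucleotide reference sequence.
reference = "ATGGCTCTGAAAGGTCCATTGACCGTA"
sense_probe = reference[5:20]
antisense_probe = reverse_complement(reference)[3:18]
```

Raising `k` to 16, 17, 18, 19, 20, 25 or more corresponds to the longer contiguous lengths recited above.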

[0133] In one embodiment, the invention provides a primer pair for detecting polynucleotide molecules associated with invertebrate foraging or with ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. The primer pair contains two isolated polynucleotides, each containing at least 15 contiguous nucleotides of one of the nucleotide sequences referenced as SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104, with one sequence annealing to the sense strand, and one sequence annealing to the anti-sense strand. The primer pair can be used, for example, to amplify polynucleotide molecules associated with invertebrate foraging or with modulation of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal by RT-PCR or PCR.
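The relationship between the two primers of such a pair can be sketched as follows. The sketch simply takes the ends of a target region; it ignores melting temperature, GC content and specificity, which a practical primer design would consider. The sense-strand sequence is hypothetical and the function names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Antisense strand of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_pair(sense, start, end, primer_length=15):
    """Return (forward, reverse) primers flanking sense[start:end].

    The forward primer is identical to the sense strand and therefore anneals
    to the anti-sense strand; the reverse primer is the reverse complement of
    the 3' end of the region and therefore anneals to the sense strand.
    """
    region = sense[start:end].upper()
    if len(region) < 2 * primer_length:
        raise ValueError("region too short for two non-overlapping primers")
    forward = region[:primer_length]
    reverse = reverse_complement(region[-primer_length:])
    return forward, reverse


# Hypothetical 42-nucleotide sense strand.
sense = "ATGGCTCTGAAAGGTCCATTGACCGTACGATCGGATCCAAGT"
forward, reverse = primer_pair(sense, 0, len(sense))
amplicon_length = len(sense)
```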

[0134] The isolated polynucleotide molecules of the invention can be produced or isolated by methods known in the art. The method chosen will depend, for example, on the type of polynucleotide molecule one intends to isolate. Those skilled in the art, based on knowledge of the nucleotide sequences disclosed herein, can readily isolate the polynucleotide molecules of the invention as genomic DNA, or desired introns, exons or regulatory sequences therefrom; as full-length cDNA or desired fragments therefrom; or as full-length mRNA or desired fragments therefrom, by methods known in the art.

[0135] A useful method for producing an isolated polynucleotide molecule of the invention involves, for example, amplification of the polynucleotide molecule using the methods such as the polymerase chain reaction (PCR) with polynucleotide primers specific for the desired polynucleotide molecule. Amplification procedures such as PCR or reverse-transcription PCR (RT-PCR) can be used to produce a polynucleotide molecule having any desired nucleotide boundaries. Desired modifications to the nucleic acid sequence can also be introduced by choosing an appropriate primer with one or more additions, deletions or substitutions. Such polynucleotide molecules can be amplified exponentially starting from as little as a single gene or mRNA copy, from any cell, tissue or species of interest.
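The exponential character of amplification noted above can be made concrete with a small calculation. The model below, in which each cycle multiplies the copy number by (1 + efficiency), is a common idealization and is an assumption of this sketch; real reactions plateau as reagents are consumed.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    efficiency is the fraction of templates duplicated per cycle,
    from 0.0 (no amplification) to 1.0 (perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Starting from a single template copy with perfect doubling.
copies_after_30_cycles = pcr_copies(1, 30)
```

Under perfect doubling, a single starting copy yields 2 to the power of 30 copies, roughly one billion, after 30 cycles.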

[0136] A further method of producing an isolated polynucleotide molecule of the invention is by screening a library, such as a genomic DNA library, cDNA library or expression library, with a detectable agent. Such libraries are commercially available or can be produced from any desired tissue, cell, or species of interest using methods known in the art. For example, a cDNA or genomic library can be screened by hybridization with a detectably labeled polynucleotide molecule having a nucleotide sequence disclosed herein. Additionally, an expression library can be screened with an antibody raised against a polypeptide encoded by a polynucleotide disclosed herein. The library clones containing polynucleotide molecules of the invention can be isolated from other clones by methods known in the art and, if desired, fragments therefrom can be isolated by restriction enzyme digestion and gel electrophoresis.

[0137] Furthermore, isolated polynucleotide molecules of the invention can be produced by synthetic means. For example, a single strand of a polynucleotide molecule can be chemically synthesized by automated synthesis methods known in the art. The complementary strand can likewise be synthesized and a double-stranded molecule made by annealing the complementary strands. Direct synthesis is particularly advantageous for producing relatively short molecules, such as polynucleotide probes and primers, and polynucleotide molecules containing modified nucleotides or linkages. However, overlapping strands with complementary overhanging regions can be synthesized and annealed to create double stranded polynucleotides that are longer than those that can be efficiently synthesized as a single strand.

[0138] In one embodiment, the isolated polynucleotide molecules of the invention are attached to a solid support, such as a chip, filter, glass slide or culture plate, by either covalent or non-covalent methods. Methods of attaching polynucleotide molecules to a solid support, and the uses of polynucleotides in this format in a variety of assays, including manual and automated hybridization assays, are well known in the art. A solid support format is particularly appropriate for automated diagnostic or screening methods, where simultaneous hybridization to a large number of polynucleotides associated with invertebrate foraging or with modulation of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal is desired, or when a large number of samples are being handled.

[0139] In another embodiment, the invention provides kits containing two or more isolated polynucleotide molecules. At least one polynucleotide molecule of the kit contains a nucleotide sequence selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104, or a minor modification thereof, or at least 15 contiguous nucleotides of a nucleic acid sequence referenced as SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104. An exemplary kit is a solid support containing an array of isolated polynucleotide molecules of the invention, including, for example, at least 3, 5, 10, 20, 30, 40, 50, 75, 100, 265 or more isolated polynucleotide molecules.

[0140] A further exemplary kit contains one or more PCR primer pairs, or two or more hybridization probes, which optionally can be labeled with a detectable moiety for detection of polynucleotide molecules. The kits of the invention can additionally contain instructions for use of the molecules for diagnostic purposes in a clinical setting, or for drug screening purposes in a laboratory setting.

[0141] If desired, the kits containing two or more isolated polynucleotide molecules can contain polynucleotide molecules corresponding to genes that are up regulated in invertebrates exhibiting negative foraging behavior, or are down regulated in invertebrates exhibiting negative foraging behavior. Additionally, the kits containing two or more isolated polynucleotide molecules can contain polynucleotide molecules corresponding to sequences identified from Drosophila screens or other invertebrate screens, from rat screens, from screens in other mammals, or any combination thereof.

[0142] The invention also provides a vector containing an isolated polynucleotide molecule associated with invertebrate foraging or with modulation of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. The vectors of the invention are useful, for example, for cloning and amplifying an isolated polynucleotide molecule, for recombinantly expressing a polypeptide, or in gene therapy applications.

[0143] Suitable expression vectors are well-known in the art and include vectors capable of expressing a polynucleotide operatively linked to a regulatory sequence or element such as a promoter region or enhancer region that is capable of regulating expression. Promoters or enhancers, depending upon the nature of the regulation, can be constitutive or inducible. The regulatory sequences or regulatory elements are operatively linked to a polynucleotide of the invention in an appropriate orientation to allow transcription of the polynucleotide.

[0144] Appropriate expression vectors include those that are replicable in eukaryotic cells and/or prokaryotic cells and those that remain episomal or those which integrate into the host cell genome. Suitable vectors for expression in prokaryotic or eukaryotic cells are well known to those skilled in the art as described, for example, in Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, Baltimore, Md. (2000). Vectors useful for expression in eukaryotic cells can include, for example, regulatory elements including the SV40 early promoter, the cytomegalovirus (CMV) promoter, the mouse mammary tumor virus (MMTV) steroid-inducible promoter, Moloney murine leukemia virus (MMLV) promoter, and the like. A vector useful in the methods of the invention can include, for example, viral vectors such as a bacteriophage, a baculovirus or a retrovirus; cosmids or plasmids; and, particularly for cloning large polynucleotide molecules, bacterial artificial chromosome vectors (BACs) and yeast artificial chromosome vectors (YACs). Such vectors are commercially available, and their uses are well known in the art. One skilled in the art will know or can readily determine an appropriate promoter for expression in a particular host cell.

[0145] Appropriate host cells include, for example, bacteria and corresponding bacteriophage expression systems, yeast, avian, insect and mammalian cells and compatible expression systems known in the art corresponding to each host species. Methods for isolating, cloning and expressing polynucleotide molecules are well known in the art and are described, for example, in Sambrook et al., supra and in Ausubel et al., supra. The choice of a particular vector and host system for expression can be determined by those skilled in the art and will depend on the preference of the user.

[0146] Recombinant cells can be generated by introducing into a host cell a vector or population of vectors containing a polynucleotide molecule encoding a binding polypeptide. A recombinant cell can be produced by transducing, transfecting or other means of introducing genetic material using a variety of methods known in the art to incorporate exogenous polynucleotides into a cell or its genome. Exemplary host cells that can be used include mammalian primary cells; established mammalian cell lines, such as COS, CHO, HeLa, NIH3T3, HEK 293 and PC12 cells; amphibian cells, such as Xenopus embryos and oocytes; and other vertebrate cells. Exemplary host cells also include insect cells such as Drosophila, yeast cells such as Saccharomyces cerevisiae, Schizosaccharomyces pombe, or Pichia pastoris, and prokaryotic cells such as Escherichia coli.

[0147] In one embodiment, a polynucleotide of the invention can be delivered into mammalian cells, either in vivo or in vitro, using suitable vectors well-known in the art. Suitable vectors for delivering a polynucleotide encoding a polypeptide to a mammalian cell include viral vectors such as retroviral vectors, adenovirus, adeno-associated virus, lentivirus, herpes virus, as well as non-viral vectors such as plasmid vectors.

[0148] Viral-based systems provide the advantage of being able to introduce relatively high levels of a heterologous polynucleotide into a variety of cells. Suitable viral vectors for introducing a polynucleotide encoding a polypeptide into mammalian cells are well known in the art. These viral vectors include, for example, Herpes simplex virus vectors (Geller et al., Science, 241:1667-1669 (1988)); vaccinia virus vectors (Piccini et al., Meth. Enzymology, 153:545-563 (1987)); cytomegalovirus vectors (Mocarski et al., in Viral Vectors, Y. Gluzman and S. H. Hughes, Eds., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1988, pp. 78-84); Moloney murine leukemia virus vectors (Danos et al., Proc. Natl. Acad. Sci. USA, 85:6460-6464 (1988); Blaese et al., Science, 270:475-479 (1995); Onodera et al., J. Virol., 72:1769-1774 (1998)); adenovirus vectors (Berkner, Biotechniques, 6:616-626 (1988); Cotten et al., Proc. Natl. Acad. Sci. USA, 89:6094-6098 (1992); Graham et al., Meth. Mol. Biol., 7:109-127 (1991); Li et al., Human Gene Therapy, 4:403-409 (1993); Zabner et al., Nature Genetics, 6:75-83 (1994)); adeno-associated virus vectors (Goldman et al., Human Gene Therapy, 10:2261-2268 (1997); Greelish et al., Nature Med., 5:439-443 (1999); Wang et al., Proc. Natl. Acad. Sci. USA, 96:3906-3910 (1999); Snyder et al., Nature Med., 5:64-70 (1999); Herzog et al., Nature Med., 5:56-63 (1999)); retrovirus vectors (Donahue et al., Nature Med., 4:181-186 (1998); Shackleford et al., Proc. Natl. Acad. Sci. USA, 85:9655-9659 (1988); U.S. Pat. Nos. 4,405,712, 4,650,764 and 5,252,479, and WIPO publications WO 92/07573, WO 90/06997, WO 89/05345, WO 92/05266 and WO 92/14829); and lentivirus vectors (Kafri et al., Nature Genetics, 17:314-317 (1997)).

[0149] The invention further provides transgenic non-human animals that are capable of expressing wild-type polynucleotides, dominant-negative polynucleotides, antisense polynucleotides, or ribozymes that target polynucleotides, where the polynucleotides are associated with invertebrate foraging or with modulation of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. Such animals have correspondingly altered expression of polypeptides associated with invertebrate foraging or with modulation of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal, and can thus be used to elucidate or confirm the function of such polypeptides, or in whole-animal assays to determine or validate the physiological effect of compounds that potentially modulate ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. The transgene may additionally comprise an inducible promoter and/or a tissue specific regulatory element, so that expression can be induced or restricted to specific cell types. Exemplary transgenic non-human animals expressing polynucleotides and polynucleotides that alter gene expression include mouse and Drosophila. Methods of producing transgenic animals are well known in the art.

[0150] The present invention provides an isolated polypeptide having ADHD-altering activity in a mammal, or fragment thereof, having substantially the same amino acid sequence as an amino acid sequence encoded by a polynucleotide sequence selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 or SEQ ID NO: 104. The invention further provides an isolated polypeptide having hypertension-altering activity in a mammal, or fragment thereof, having substantially the same amino acid sequence as an amino acid sequence encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 or SEQ ID NO: 104. The invention also provides an isolated polypeptide having ADHD-altering activity or hypertension-altering activity in a mammal, or fragment thereof, comprising a polypeptide having a molecular weight of 50 kD and pI of 4.1, a molecular weight of 28 kD and pI of 8.7, a molecular weight of 36 kD and pI of 6.0, a molecular weight of 34 kD and pI of 6.3, a molecular weight of 25 kD and pI of 5.9, a molecular weight of 12 kD and pI of 5.7, a molecular weight of 12 kD and pI of 6.4, a molecular weight of 12 kD and pI of 6.4, or a molecular weight of 29 kD and pI of 6.5, wherein the molecular weight and isoelectric point (pI) are determined by two-dimensional polyacrylamide gel electrophoresis using the methods described in Unlu et al., Electrophoresis 18:2071 (1997) in tandem with excision of polypeptide-containing bands from other bands in the gel, in-gel trypsin digestion, extraction, purification and analysis by MALDI-TOF as described in Hellman et al., Anal. Biochem. 224:451 (1995). One skilled in the art will know that the above described molecular weights and isoelectric points, being determined from two-dimensional polyacrylamide gel electrophoresis, can differ considerably from those predicted based on sequence data alone. Identification and characterization of polypeptides having the above described features is further described in Example II.
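Matching an observed gel spot against the molecular weight and pI values recited above can be sketched as a tolerance lookup. The tolerances of 1 kD and 0.2 pI units, the function name and the observed values are assumptions for illustration; appropriate tolerances depend on the gel system and are not specified here.

```python
# Distinct (molecular weight in kD, pI) pairs recited in the paragraph above.
REPORTED_SPOTS = [
    (50, 4.1), (28, 8.7), (36, 6.0), (34, 6.3),
    (25, 5.9), (12, 5.7), (12, 6.4), (29, 6.5),
]


def matching_spots(mw_kd, pi, mw_tolerance=1.0, pi_tolerance=0.2):
    """Reported spots within the given tolerances of an observed spot."""
    return [
        (mw, spot_pi)
        for mw, spot_pi in REPORTED_SPOTS
        if abs(mw - mw_kd) <= mw_tolerance and abs(spot_pi - pi) <= pi_tolerance
    ]


# Hypothetical observed spot from a two-dimensional gel.
candidates = matching_spots(28.4, 8.6)
```

Because gel-derived values can differ considerably from sequence-based predictions, a lookup of this kind is better suited to comparing one gel with another than to comparing a gel with values computed from sequence.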

[0151] Isolated polypeptides of the invention can be used in a variety of applications. For example, isolated polypeptides can be used to generate specific antibodies, or in screening or validation methods where it is desired to identify or characterize compounds that alter the activity of polypeptides that modulate ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal.

[0152] The isolated polypeptides of the invention can be prepared by methods known in the art, including biochemical, recombinant and synthetic methods. For example, invention polypeptides can be purified by routine biochemical methods from neural cells or other cells that express abundant amounts of the polypeptide. Methods for isolating polypeptides are well known in the art as described, for example, in Scopes, Protein Purification: Principles and Practice, 3^(rd) Ed., Springer-Verlag, New York (1994); Deutscher, Methods in Enzymology, Vol 182, Academic Press, San Diego (1990), and Coligan et al., Current Protocols in Protein Science, John Wiley and Sons, Baltimore, Md. (2000).

[0153] An invention polypeptide can also be produced by recombinant methods as described above. Recombinant methods involve expressing a polynucleotide molecule encoding the desired polypeptide in a host cell or cell extract, and isolating the recombinant polypeptide, such as by the routine biochemical purification methods also described above. Methods for producing and expressing recombinant polypeptides in vitro and in prokaryotic and eukaryotic host cells are well known in the art as described, for example, in Goeddel, Methods in Enzymology, Vol 185, Academic Press, San Diego (1990); Wu, Methods in Enzymology, Vol 217, Academic Press, San Diego (1993); Sambrook et al., supra, and in Ausubel et al., supra. Furthermore, invention polypeptides can be produced by synthetic methods well known in the art including, for example, Merrifield solid phase synthesis, t-Boc based synthesis, Fmoc synthesis and variations thereof.

[0154] A polypeptide of the invention can accommodate minor modifications that can confer additional properties onto the polypeptide so long as such modifications do not inhibit the polypeptide's activity as it relates to invertebrate foraging or modulation of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. A modification can be made, for example, to facilitate identification and purification of the recombinant polypeptide. In this regard, it is often desirable to insert or add, in-frame with the coding sequence, polynucleotides that encode epitope tags or other binding sequences, or sequences that direct secretion of the polypeptide. A binding sequence that can be used to capture a polypeptide includes, for example, a biotinylation sequence, polyhistidine tag (Qiagen; Chatsworth, Calif.), antibody epitope such as the flag peptide (Sigma; St Louis, Mo.), glutathione-S-transferase (Amersham Pharmacia; Piscataway, N.J.), cellulose binding domain (Novagen; Madison, Wis.), calmodulin (Stratagene; San Diego, Calif.), staphylococcus protein A (Pharmacia; Uppsala, Sweden), maltose binding protein (New England BioLabs; Beverly, Mass.) or strep-tag (Genosys; Woodlands, Tex.) or minor modifications thereof. A modification can also be made to increase stability and can include, for example, incorporation of a cysteine to form a thioether cross-link, removal of a protease recognition sequence, addition of a charged amino acid to promote ionic interactions, or addition of a hydrophobic amino acid to promote hydrophobic interactions. In addition, a polypeptide of the invention can be modified to incorporate additional amino acids, remove amino acids, substitute amino acids, chemically modify amino acids or incorporate non-natural amino acids.

[0155] An antibody specific for an isolated polypeptide of the invention is also provided. An antibody specific for a polypeptide of the invention can be a polyclonal or monoclonal antibody. Such antibodies can be used, for example, in diagnostic assays such as ELISA to detect or quantitate the expression of polypeptides of the invention; to purify polypeptides of the invention; or as therapeutic compounds to selectively target a polypeptide of the invention. Such antibodies, if desired, can be bound to a solid support, such as a chip, filter, glass slide or culture plate. An antibody of the invention can be prepared and used according to methods that are well known in the art as described, for example, in Harlow and Lane, supra.

[0156] The invention provides diagnostic methods based on the newly identified and characterized polynucleotides described herein. In one embodiment, the invention provides a method of diagnosing ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. The method consists of determining an expression profile of the individual, and comparing that profile to a reference profile indicative of the particular disease. Correspondence between the profile of the individual and the reference profile indicates that the individual has the disorder. In one embodiment, at least one of the polynucleotides profiled is a polynucleotide containing a nucleic acid sequence substantially the same as one of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 or SEQ ID NO: 104. Typically, at least one of the polynucleotides profiled is selected from the group consisting of foraging/dg2; alcohol dehydrogenase (SEQ ID NO: 76); inositol polyphosphate 1-phosphatase; inositol 1,4,5-tri-phosphate receptor; Dead Box-1; CNS-specific protein Noe; cellular repressor of E1A-stimulated genes (SEQ ID NO: 78); 14-3-3 ε; casein kinase II α subunit; syntaxin I; ADP/ATP translocase/sesB; mitochondrial porin; neuron specific zinc finger transcription factor (scratch); ecdysone-regulated (E93); centrosomal and chromosomal factor (ccf); activin β precursor (SEQ ID NO: 80); dynamin-like (SEQ ID NO: 81); paramyosin (SEQ ID NO: 83); mitochondrial ATP synthase α subunit (SEQ ID NO: 86); Fas-associated factor (FFAF) (SEQ ID NO: 88); lamin precursor (SEQ ID NO: 90); and ribosomal protein S6.
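
The comparison of an individual's expression profile with a reference profile can be illustrated with a minimal sketch. The specification does not state how correspondence is scored, so the Pearson correlation, the 0.8 cutoff and the function names below are illustrative assumptions only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / math.sqrt(var_x * var_y)


def corresponds(profile, reference, threshold=0.8):
    """Return True when the profile correlates with the reference profile.

    The threshold is a hypothetical cutoff, not a value from the specification.
    """
    return pearson(profile, reference) >= threshold
```

Each vector position would hold the measured expression level of one profiled polynucleotide, listed in the same order in both profiles.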

[0157] The methods of diagnosing ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal have numerous applications. Appropriate diagnosis of such diseases will allow more effective treatments: using currently available treatments for ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal; using compounds identified from the screens described herein; using the therapeutic methods described herein; or any combination of these treatments. Likewise, methods of diagnosing ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal are applicable to monitoring the course of therapy for the disorder, such that appropriate modifications can be made if needed.

[0158] Furthermore, the methods of diagnosing ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal are applicable to screening for such diseases among the general population, or among populations in whom such diseases influence the safety of the individual or the general population. The diagnostic methods of the invention can also advantageously be used to characterize a previously unrecognized disease associated with a NO/cGMP-dependent kinase network in a mammal, or to newly categorize such a disease, based on characteristic patterns of expression or activity of polynucleotides associated with invertebrate foraging. Re-categorization of a disease can lead to new or alternate treatments to increase efficacy, reduce side effects or otherwise tailor treatment to the needs of an individual. The diagnostic methods of the invention can also be advantageously used to identify the specific polynucleotides most closely associated with, and thus likely to play a causative role, in ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal.

[0159] Those skilled in the art understand that the methods described herein for diagnosing ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal can readily be applied to methods of screening for novel compounds; to methods of validating the efficacy of compounds identified by other methods to modulate ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal; to methods of determining effective dose, time and route of administration of known therapeutic compounds; to methods of determining the molecular mechanisms of action of known therapeutic compounds; and the like. Such methods can be performed in laboratory animals, such as mice, rats, rabbits, dogs, cats, pigs or primates, or in a clinical setting with humans.

[0160] The invention provides a method of modulating ADHD, hypertension or other disease associated with an NO/cGMP-dependent protein kinase network by administering to a mammalian subject an effective amount of a compound that modulates the expression of a polynucleotide selected from the group consisting of foraging/dg2 (SEQ ID NO: 47); alcohol dehydrogenase (SEQ ID NO: 75); inositol polyphosphate 1-phosphatase (SEQ ID NO: 48); inositol 1,4,5-tri-phosphate receptor (SEQ ID NO: 49); Dead Box-1 (SEQ ID NO: 50); CNS-specific protein Noe (SEQ ID NO: 51); cellular repressor of E1A-stimulated genes (SEQ ID NO: 77); 14-3-3 ε (SEQ ID NO: 52); casein kinase II α subunit (SEQ ID NO: 53); mRNA sequence similar to syntaxin I (SEQ ID NO: 54); ADP/ATP translocase/sesB (SEQ ID NO: 55); mitochondrial porin (SEQ ID NO: 56); neuron specific zinc finger transcription factor (scratch) (SEQ ID NO: 57); ecdysone-regulated (E93) (SEQ ID NO: 58); centrosomal and chromosomal factor (ccf) (SEQ ID NO: 59); activin β precursor (SEQ ID NO: 79); dynamin-like (SEQ ID NO: 81); paramyosin (SEQ ID NO: 83); mitochondrial ATP synthase α subunit (SEQ ID NO: 85); Fas-associated factor (FFAF) (SEQ ID NO: 87); lamin precursor (SEQ ID NO: 89); 18S, 5.8S, 2S and 28S rRNA genes (SEQ ID NO: 60); and ribosomal protein S6 gene (SEQ ID NO: 61).

[0161] The invention also provides a method of modulating ADHD, hypertension or other disease associated with an NO/cGMP-dependent protein kinase network by administering to a mammalian subject an effective amount of a compound that modulates the activity or expression of a polypeptide selected from the group consisting of foraging/dg2; alcohol dehydrogenase (SEQ ID NO: 76); inositol polyphosphate 1-phosphatase; inositol 1,4,5-tri-phosphate receptor; Dead Box-1; CNS-specific protein Noe; cellular repressor of E1A-stimulated genes (SEQ ID NO: 78); 14-3-3 ε; casein kinase II α subunit; syntaxin I; ADP/ATP translocase/sesB; mitochondrial porin; neuron specific zinc finger transcription factor (scratch); ecdysone-regulated (E93); centrosomal and chromosomal factor (ccf); activin β precursor (SEQ ID NO: 80); dynamin-like (SEQ ID NO: 81); paramyosin (SEQ ID NO: 83); mitochondrial ATP synthase α subunit (SEQ ID NO: 86); Fas-associated factor (FFAF) (SEQ ID NO: 88); lamin precursor (SEQ ID NO: 90); and ribosomal protein S6.

[0162] In addition, the invention provides a method of modulating ADHD, hypertension or other disease associated with an NO/cGMP-dependent protein kinase network by administering to a mammalian subject an effective amount of a compound that modulates the expression of a polynucleotide selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104, or the activity or expression of a polypeptide encoded by a polynucleotide selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.

[0163] Methods of treating an individual are intended to include preventing, ameliorating, curing, and reducing the severity of a disease or symptoms associated with a disease. Those skilled in the art understand that any degree of reduction in severity of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal can improve the health or quality of life of an individual. The effect of the therapy can be determined by those skilled in the art, by comparison to baseline values for symptoms or clinical or diagnostic markers associated with the disorder.

[0164] A compound identified by the methods of the invention can be administered to a mammal for the purpose of treating ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. An effective amount of a compound to treat ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal is an amount of the compound required to effect a decrease in a symptom or severity of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal respectively. The dosage required to be therapeutically effective will depend, for example, on the particular disease, the route and form of administration, the weight and condition of the individual, and previous or concurrent therapies. The appropriate amount considered to be an effective dose for a particular application of the method can be determined by those skilled in the art, using the guidance provided herein. For example, the amount can be determined from diagnostic or gene expression assays described herein. One skilled in the art will recognize that the condition of the patient can be monitored throughout the course of therapy and that the amount of compound that is administered can be adjusted accordingly.

[0165] An effective amount can be, for example, between about 10 μg/kg and 500 mg/kg body weight, for example, between about 0.1 mg/kg and 100 mg/kg, or preferably between about 1 mg/kg and 50 mg/kg, depending on the treatment regimen. For example, if a compound that modulates ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal is administered from one to several times a day, then a lower dose would be needed than if a formulation were administered weekly, or monthly or less frequently. Similarly, formulations that allow for timed-release of such a compound would provide for the continuous release of a smaller dose than would be administered as a single bolus dose. For example, a compound that modulates ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal can be administered at between about 1-5 mg/kg/week.
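
The weight-based ranges above translate into absolute amounts by simple arithmetic, sketched below. The 70 kg body weight and the split into seven daily administrations are hypothetical values chosen only for illustration; they are not taken from the specification.

```python
# Broadest dose range stated above, expressed in mg/kg (10 ug/kg = 0.01 mg/kg).
MIN_MG_PER_KG = 0.01
MAX_MG_PER_KG = 500.0


def total_dose_mg(dose_mg_per_kg, body_weight_kg):
    """Absolute amount (mg) for a weight-based dose."""
    return dose_mg_per_kg * body_weight_kg


def per_administration_mg(weekly_mg_per_kg, body_weight_kg, doses_per_week):
    """Amount (mg) per administration when a weekly dose is split evenly."""
    return total_dose_mg(weekly_mg_per_kg, body_weight_kg) / doses_per_week


def within_stated_range(dose_mg_per_kg):
    """True when a dose falls inside the broadest range stated above."""
    return MIN_MG_PER_KG <= dose_mg_per_kg <= MAX_MG_PER_KG


# Hypothetical 70 kg subject receiving 1-5 mg/kg/week.
weekly_low = total_dose_mg(1, 70)    # 70 mg per week
weekly_high = total_dose_mg(5, 70)   # 350 mg per week
daily_high = per_administration_mg(5, 70, 7)  # 50 mg per day if dosed daily
```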

[0166] A compound that modulates ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal can be delivered systemically, such as intravenously or intra-arterially. Such a compound can also be administered locally at a site of the pathological condition. Appropriate sites for administration of a compound that modulates ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal are known, or can be determined, by those skilled in the art depending on the clinical indications of the individual being treated. For example, a compound can be administered in the brain or nervous system of an individual having ADHD or in the vasculature of an individual having hypertension. A compound can be provided in a substantially purified form in pharmaceutically acceptable formulations using formulation methods known to those of ordinary skill in the art. These formulations can be administered by standard routes, including, for example, topical, transdermal, intraperitoneal, intracranial, intracerebroventricular, intracerebral, intravaginal, intrauterine, oral, rectal or parenteral (e.g., intravenous, intraspinal, subcutaneous or intramuscular) routes. Methods for such routes of administration are well known to those skilled in the art.

[0167] In addition, a compound that modulates ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal can be incorporated into biodegradable polymers allowing for sustained release of the compound, the polymers being implanted in the vicinity of where drug delivery is desired, for example, in the brain for an ADHD patient or in a vascular tissue for a hypertension patient. Osmotic minipumps also can be used to provide controlled delivery of specific concentrations of such compounds and formulations through cannulae to the site of interest, such as directly into a nervous or vascular tissue. The biodegradable polymers and their use are described, for example, in detail in Brem et al., J. Neurosurg. 74:441-446 (1991).

[0168] A compound that modulates ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal can be administered as a solution or suspension together with a pharmaceutically acceptable medium in such a manner as to ensure proper distribution in vivo. For example, the blood-brain barrier excludes many highly hydrophilic compounds. To ensure that the compounds intended to treat ADHD cross the blood-brain barrier, they can be formulated, for example, in liposomes, or chemically derivatized. Other pharmaceutically acceptable media include, for example, water, sodium phosphate buffer, phosphate buffered saline, normal saline or Ringer's solution or other physiologically buffered saline, or other solvent or vehicle such as a glycol, glycerol, an oil such as olive oil or an injectable organic ester, so long as the formulation is of sufficient purity and quality for use in humans, sterile and substantially free from contaminating particles and organisms.

[0169] A pharmaceutically acceptable medium can additionally contain physiologically acceptable compounds that act, for example, to stabilize the compound that modulates ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. Such physiologically acceptable compounds include, for example, carbohydrates such as glucose, sucrose or dextrans; antioxidants such as ascorbic acid or glutathione; chelating agents such as EDTA, which disrupts microbial membranes; divalent metal ions such as calcium or magnesium; low molecular weight proteins; lipids or liposomes; or other stabilizers or excipients.

[0170] A compound that modulates ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal can be administered in conjunction with other therapies. For example, for treating ADHD, the compound can be administered prior to, during, or subsequent to other ADHD treatments including, for example, behavior modification or drug therapies such as methylphenidate (Ritalin), amphetamines or pemoline (Cylert). Similarly, for treating hypertension the compound can be administered prior to, during, or subsequent to treatments such as controlled regimens of diet or exercise or drug therapies such as diuretics, ACE inhibitors, beta blockers, vasodilators or calcium channel blockers. For a description of treatments for ADHD or hypertension see, for example, The Merck Manual, Sixteenth Ed. (Berkow, R., Editor), Rahway, N.J., 1992.

[0171] It will be understood that the efficacy and safety of a compound in laboratory mammals can be evaluated before administering the compound to humans or veterinary animals. For example, the compound can be tested for its maximal efficacy and any potential side-effects using several different invertebrates or laboratory mammals, across a range of doses, in a range of formulations, and at various times of day, such as before or after sleeping, before or after eating, and the like. Generally, a compound identified using the methods of the invention will cause few or no deleterious or unwanted side effects.

[0172] Thus, in one embodiment, the invention provides a method of determining the efficacy of a compound in treating ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal. The method consists of administering a compound to an individual having ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal, and determining an effect of the compound on the expression profile of the individual. A compound that modulates the expression profile of the individual to correspond to an unaffected or normal profile indicates that the compound is effective in treating ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal.
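
One way to picture a profile being modulated "to correspond to an unaffected or normal profile" is to ask whether the treated profile lies closer to the normal profile than the untreated one did. The sketch below assumes a Euclidean distance over expression levels; the metric and the function names are illustrative assumptions, not part of the described method.

```python
import math


def profile_distance(profile_a, profile_b):
    """Euclidean distance between two equal-length expression profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def moved_toward_normal(before, after, normal):
    """True when the post-treatment profile is closer to the normal profile."""
    return profile_distance(after, normal) < profile_distance(before, normal)
```

With hypothetical levels `normal = [1.0, 1.0, 1.0]`, `before = [3.0, 0.2, 2.5]` and `after = [1.4, 0.8, 1.2]`, `moved_toward_normal(before, after, normal)` returns `True`.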

[0173] Once genes associated with ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal have been identified, the expression or activity of such genes in humans or other mammals can be selectively targeted in order to prevent or treat the disease. The diagnostic, screening and validation methods of the invention are useful in determining appropriate genes to target and appropriate therapeutic compounds to use for a particular indication.

[0174] If desired, treating an individual having ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal can be achieved with gene therapy. Methods of introducing and expressing polynucleotides in animals, including humans, are well known in the art. For example, gene therapy methods can be performed by ex vivo methods, wherein cells (e.g. hematopoietic cells, including stem cells) are removed from the body, engineered to express a polypeptide associated with invertebrate foraging or with modulation of ADHD, hypertension or other disease associated with a NO/cGMP-dependent kinase network in a mammal, and returned to the body. Gene therapy methods can also be performed by in situ methods, such that an expressible polynucleotide molecule is placed directly into an appropriate tissue, such as the brain or CNS, by a direct route such as injection or implantation during surgery. Gene therapy methods can also be performed in vivo, wherein the expressible polynucleotide molecule is administered systemically, such as intravenously. Appropriate vectors for gene therapy can be determined by those skilled in the art for a particular application of the method, and include, but are not limited to, retroviral vectors (e.g. replication-defective MuLV, HTLV, and HIV vectors); adenoviral vectors; adeno-associated viral vectors; herpes simplex viral vectors; and non-viral vectors. Appropriate formulations for delivery of polynucleotides can also be determined by those skilled in the art, and include, for example, liposomes; polycationic agents; naked DNA; and DNA associated with or conjugated to targeting molecules (e.g. antibodies, ligands, lectins, fusogenic peptides, or HIV tat peptide). Gene therapy methods, including considerations for choice of appropriate vectors, promoters, formulations and routes of delivery, are reviewed, for example, in Anderson, Nature 392:25-30 (1998).

[0175] It is understood that modifications which do not substantially affect the activity of the various embodiments of this invention are also provided within the definition of the invention provided herein. Accordingly, the following examples are intended to illustrate but not limit the present invention.

EXAMPLE I

[0176] Measuring Foraging in for^(R) and for^(S) Strains of Drosophila

[0177] This example shows the measurement of foraging scores for strains of Drosophila that have varying foraging behaviors.

[0178] Drosophila lines with various alleles of the foraging locus (for), encoding one of two forms of cGMP-dependent protein kinase, were used. The for^(R) strain, termed Rover, is a naturally occurring allelic variant which demonstrates high activity in the presence of food or in the fed state. The for^(S) strain, termed sitter, is also a naturally occurring variant and shows low activity in the presence of food or in a fed state. The naturally occurring for^(R) and for^(S) strains were obtained from M. Sokolowski (York University, North York, Ontario, Canada) and have been described previously, for example, in de Belle et al., Heredity 59:73 (1987). In addition, a variant demonstrating reduced activity in the presence of food, termed for^(s2), was used. The for^(s2) variant was produced by X-ray induced mutation at the for locus and does not complement the for^(R) or for^(S) alleles as described, for example, in de Belle et al., Genetics 123:157 (1989).

[0179] These strains were tested in a foraging maze constructed according to the design of Hirsch (J. Comp. Physiol. Psychol. 52:304-308 (1959)) as modified by McMillan and McGuire (Behav. Genet. 22:557-573 (1992)), shown in FIG. 1.

[0180] The foraging maze was fashioned from plexiglass, and the maze chambers were hollowed out as semi-circular depressions in each slab, such that when the two slabs were bolted together a circular tube was formed in the maze. This design made it possible to observe the entire time course of the assay. The maze was placed flat on a uniform light source so that each path from the start tube to a collection tube was equidistant and not subject to an elevation gain, and so that the maze was evenly illuminated throughout.

[0181] For each strain, a foraging score was determined. In separate measurements, twenty to thirty flies, 3-5 days old, were starved overnight, allowed to feed briefly (15 minutes) on 0.25 M sucrose, and then placed into the start tube of the maze. After 5 minutes, the number of flies that had reached any of the final collection tubes was summed and expressed as a percentage of the total number tested to yield the “foraging score.” Standard errors of the mean (SEM) were also calculated. The foraging measurements were repeated 4-5 times (N) for each strain (see Table 2).

TABLE 2

Strain      N (20-30 flies in each)   Foraging score   SEM
for^(R)     4                         76.1             10.3
for^(S)     5                         36.7             14.8
for^(s2)    4                         30.9             11.2
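
The foraging score and its standard error follow directly from the counts described above, as in the minimal sketch below. The function names are illustrative, the raw per-trial counts are not given in this example, and the use of the sample standard deviation (n - 1 denominator) for the SEM is an assumption.

```python
import math


def foraging_score(flies_in_collection_tubes, flies_tested):
    """Percentage of tested flies that reached any final collection tube."""
    if flies_tested <= 0 or not 0 <= flies_in_collection_tubes <= flies_tested:
        raise ValueError("counts must satisfy 0 <= reached <= tested, tested > 0")
    return 100.0 * flies_in_collection_tubes / flies_tested


def mean_and_sem(scores):
    """Mean and standard error of the mean over repeated measurements (N)."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two repeated measurements are required")
    mean = sum(scores) / n
    # Sample variance with an n - 1 denominator (assumed convention).
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return mean, math.sqrt(variance / n)
```

For instance, a hypothetical trial in which 15 of 20 flies reach the collection tubes gives `foraging_score(15, 20)` of 75.0.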

[0182] As shown in Table 2, the for^(R) strain showed a roughly 2-fold increase in foraging score as compared to the for^(S) strain and a roughly 2.5-fold increase in foraging score as compared to the for^(s2) strain.
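
The fold differences quoted above can be checked directly from the mean foraging scores in Table 2. The short sketch below only restates that arithmetic; the dictionary and function names are illustrative.

```python
# Mean foraging scores copied from Table 2.
TABLE_2_SCORES = {"for^(R)": 76.1, "for^(S)": 36.7, "for^(s2)": 30.9}


def fold_change(strain, baseline, scores=TABLE_2_SCORES):
    """Ratio of the mean foraging score of one strain to that of another."""
    return scores[strain] / scores[baseline]


rover_vs_sitter = fold_change("for^(R)", "for^(S)")   # about 2.1
rover_vs_s2 = fold_change("for^(R)", "for^(s2)")      # about 2.5
```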

[0183] These results demonstrate an assay useful in quantitatively determining a foraging behavior for a group of flies. These results further demonstrate statistically significant differences in foraging behaviors for Rovers and sitters.

EXAMPLE II

[0184] Molecular Components of the NO/cGMP-Dependent Protein Kinase as Identified in Drosophila

[0185] This example demonstrates the identification of polynucleotides and polypeptides that are differentially expressed in Drosophila having a for^(R) allele compared to Drosophila that are homozygous for the for^(s2) allele.

[0186] The genetic differences between for^(R) and for^(s2) were used as a basis for identifying genes in the NO/cGMP-dependent protein kinase network based on their differential expression between these two strains. Differential display of mRNA was used to assay differences in mRNA levels, and two-dimensional polyacrylamide gel electrophoresis in tandem with mass spectrometry was used to identify differences in protein expression levels.

[0187] For each strain, twenty flies, 3-5 days old, were starved overnight then allowed to feed on 0.25 M sucrose for 15 minutes prior to decapitation. For mRNA differential display, 20 fly heads were homogenized in TRIzol (Life Technologies, Inc., Frederick, Md.) and extracted according to the manufacturer's instructions. Differential display of mRNA was performed as described in Cirelli et al., supra. Sequences were analyzed using the BLAST program at either the Berkeley Drosophila Genome Project or NCBI websites.

[0188] For protein analysis, 35 fly heads were homogenized and separated by two-dimensional polyacrylamide gel electrophoresis using the methods described in Unlu et al., Electrophoresis 18:2071 (1997). Spots were matched between gels, and those showing the strongest differences were excised with a scalpel, subjected to in-gel trypsin digestion, extraction and purification, and then analyzed by MALDI-TOF as described in Helman et al., Anal. Biochem. 224:451 (1995). The resulting spectra were analyzed by matching peptide patterns to those in the ProFound database administered by the Rockefeller University.

[0189] A number of mRNA sequences that were differentially expressed could be identified as expressed sequence tags (ESTs) in the Berkeley Drosophila Genome Project or NCBI databases and are provided as SEQ ID NOS: 1-46. In addition, a number of previously cloned Drosophila genes were identified as shown in Table 3.

TABLE 3

Change in for^(R) vs. for^(s2)   Differentially expressed mRNA
increase                         DEAD-box protein (Ddx1)
increase                         CNS-specific protein Noe (noe)
increase                         14-3-3 ε
increase                         cellular repressor of E1A-stimulated genes (CREG)
increase                         casein kinase II α subunit
increase                         mRNA sequence similar to syntaxin I
increase                         ADP/ATP translocase
increase                         mitochondrial porin
decrease                         neuron specific zinc finger transcription factor (scratch)
decrease                         ecdysone-regulated (E93)
decrease                         centrosomal and chromosomal factor (ccf)

[0190] A number of proteins that were differentially expressed could be identified as shown in Table 4.

TABLE 4

Change in for^(R) vs. for^(s2)   Differentially expressed protein
increase                         cellular repressor of E1A-stimulated genes (CREG)
increase                         activin β precursor
increase                         dynamin-like (S17974)
increase                         paramyosin
increase                         mitochondrial ATP synthase α subunit
decrease                         Fas-associated factor (FFAF)
decrease                         lamin precursor
decrease                         alcohol dehydrogenase (Adh)

[0191] Additionally, a number of proteins were identified according to mass and isoelectric point (pI) as shown in Table 5.

TABLE 5

Sample number   Molecular weight (kD)   pI
1               50                      4.1
4               28                      8.7
8               36                      6.0
9               34                      6.3
11              25                      5.9
13              12                      5.7
14              12                      6.4
18              29                      6.5

[0192] The mRNA sequences and polypeptides identified in this example represent targets for modulation of ADHD, hypertension or other diseases associated with a NO/cGMP-dependent protein kinase network in a mammal. In addition, the mRNA and polypeptides identified in this example can be used as a molecular fingerprint or diagnostic of ADHD, hypertension or other diseases associated with a NO/cGMP-dependent protein kinase network in a mammal. Because differences can be observed between naturally tolerable phenotypes (i.e. changes from Rover to sitter behavior) with no loss of fitness or health, this fingerprint can be used to determine the extent to which a test compound will modulate foraging and therefore ADHD, hypertension or other diseases associated with a NO/cGMP-dependent protein kinase network in a mammal.

EXAMPLE III

[0193] Effects of an Alcohol Dehydrogenase Allele Adh^(n1) on Foraging

[0194] This example demonstrates use of a foraging assay to determine the effects of various alleles, independent of or in combination with Rover/sitter, on the activity of Drosophila in response to feeding and starvation.

[0195] The Adh^(n1) allele of the alcohol dehydrogenase locus results in 20% of the normal level of alcohol dehydrogenase activity as described in Chenevert et al., Biochem. J. 308:419 (1995). Adh^(n1) flies were obtained from The Drosophila Stock Center, Dept. of Biology, Indiana University, Bloomington, Ind. Doubly homozygous mutants were made with Rover or sitter and a mutant of the ipp locus (ipp²) as described in Greenspan, supra. A mutant of another component of the inositol phosphate signaling pathway, the inositol 1,4,5-tri-phosphate receptor (Itp-r83A; Acharya et al., Neuron 18:881 (1997)), was also tested for its activity on foraging. The ipp² and Itp-r83A flies were obtained from C. Zuker, Dept. of Biology, University of California, San Diego.

[0196] All strains were assayed for foraging behavior as described in Example I. Foraging scores are presented in Table 6.

TABLE 6

Strain            Fed      Starved   Starved then fed
for^(R)           82.4     54.8      76.1
for^(s2)          48.0     39.2      30.9
Adh^(n1)          8.9      15.0      31.1
for^(R); ipp²     100.0    70.8      45.5
for^(s2); ipp²    40.0     33.3      38.2
Itp-r83A¹         8.9      23.3      6.7
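
One way to compare how each strain in Table 6 responds to brief feeding after starvation is to take the ratio of the "starved then fed" score to the "starved" score. This ratio is an illustrative summary chosen here and is not a measure defined in this example; the names below are likewise assumptions.

```python
# Foraging scores copied from Table 6: (fed, starved, starved then fed).
TABLE_6 = {
    "for^(R)": (82.4, 54.8, 76.1),
    "for^(s2)": (48.0, 39.2, 30.9),
    "Adh^(n1)": (8.9, 15.0, 31.1),
    "for^(R); ipp2": (100.0, 70.8, 45.5),
    "for^(s2); ipp2": (40.0, 33.3, 38.2),
    "Itp-r83A": (8.9, 23.3, 6.7),
}


def refeeding_ratio(strain, table=TABLE_6):
    """Starved-then-fed score divided by starved score for one strain."""
    _fed, starved, starved_then_fed = table[strain]
    return starved_then_fed / starved


# A ratio above 1 means foraging rose after brief feeding; below 1, it fell.
ratios = {strain: round(refeeding_ratio(strain), 2) for strain in TABLE_6}
```

On these values the ratio is above 1 for for^(R), Adh^(n1) and for^(s2); ipp², and below 1 for for^(s2), for^(R); ipp² and Itp-r83A.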

[0197] As shown in Table 6, while less active than Rover or sitter under normal or starved conditions, Adh^(n1) produces a Rover-like increase in its foraging in response to starvation followed by brief feeding. This is consistent with what would be expected based on the reduction in Adh levels seen in Rovers vs. sitters in Example II.
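
The comparison drawn here can be checked directly against the scores in Table 6. The following is a minimal sketch, not part of the disclosed method: it assumes that a "Rover-like increase" can be read as a positive difference between the starved-then-fed score and the starved score. The function name refeeding_change and the dictionary layout are hypothetical and introduced only for illustration.

```python
# Foraging scores transcribed from Table 6, ordered as
# (fed, starved, starved then fed).
TABLE_6 = {
    "for^(R)": (82.4, 54.8, 76.1),
    "for^(s2)": (48.0, 39.2, 30.9),
    "Adh^(n1)": (8.9, 15.0, 31.1),
}


def refeeding_change(scores):
    """Starved-then-fed score minus starved score (assumed measure, see lead-in)."""
    _fed, starved, starved_then_fed = scores
    return round(starved_then_fed - starved, 1)


changes = {strain: refeeding_change(s) for strain, s in TABLE_6.items()}

for strain, delta in changes.items():
    direction = "increase" if delta > 0 else "decrease"
    print(f"{strain}: {delta:+.1f} ({direction})")
```

Under this reading, for^(R) and Adh^(n1) both show a positive change after brief feeding, whereas for^(s2) shows a negative change.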

[0198] Also shown in Table 6, doubly homozygous mutants of for^(R);ipp² showed a dramatic alteration of Rover phenotype after being starved then fed when compared to for^(R) flies. However, very little effect was observed on the sitter phenotype for doubly homozygous for^(s2); ipp² when compared to for^(s2) flies under any condition. This specific interaction with Rover identifies ipp as a target for modulating foraging behavior in invertebrates and therefore ADHD, hypertension or a disease associated with a NO/cGMP-dependent protein kinase in a mammal.
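
The interaction described in this paragraph can be illustrated by subtracting each background strain's score from the score of the corresponding ipp² double mutant, condition by condition. This is a sketch under stated assumptions: a simple difference is used as the measure of the ipp² effect, which the specification does not prescribe, and the helper ipp_effect and the plain-text key "ipp2" are hypothetical names.

```python
# Foraging scores transcribed from Table 6, ordered as
# (fed, starved, starved then fed). "ipp2" stands for the ipp² allele.
TABLE_6 = {
    "for^(R)": (82.4, 54.8, 76.1),
    "for^(s2)": (48.0, 39.2, 30.9),
    "for^(R); ipp2": (100.0, 70.8, 45.5),
    "for^(s2); ipp2": (40.0, 33.3, 38.2),
}
CONDITIONS = ("fed", "starved", "starved then fed")


def ipp_effect(background):
    """Double-mutant score minus background score for each condition."""
    single = TABLE_6[background]
    double = TABLE_6[f"{background}; ipp2"]
    return {c: round(d - s, 1) for c, s, d in zip(CONDITIONS, single, double)}


rover_effect = ipp_effect("for^(R)")
sitter_effect = ipp_effect("for^(s2)")

for label, effect in (("Rover", rover_effect), ("sitter", sitter_effect)):
    print(label, effect)
```

With the Table 6 values, the largest shift in either background is the drop in the Rover starved-then-fed score, while every shift in the sitter background is smaller in magnitude.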

[0199] As further shown in Table 6, a mutant of the inositol phosphate signaling pathway, the inositol 1,4,5-tri-phosphate receptor (Itp-r83A) also shows an influence on foraging, exhibiting a phenotype opposite that of Rover. More specifically, Itp-r83A flies displayed more activity under conditions of starvation than under either of the other conditions. This identifies Itp-r83A as a target for modulating foraging behavior in invertebrates and therefore ADHD, hypertension or a disease associated with a NO/cGMP-dependent protein kinase in a mammal.

EXAMPLE IV

[0200] Changes in Foraging Behavior Induced by Ritalin

[0201] This Example demonstrates the biological activity of methylphenidate (Ritalin) in Drosophila melanogaster.

[0202] Methylphenidate is a biologically active compound in mammals that is used clinically to ameliorate symptoms associated with attention-deficit/hyperactivity disorder (ADHD). Methylphenidate can also be used to treat excessive daytime sleepiness associated with many sleep disorders including narcolepsy. The therapeutic role of this medication has proven effective not only for treating excessive sleepiness but also for improving associated deficits in both affect and cognition. In order to determine whether methylphenidate is biologically active in an invertebrate, methylphenidate was administered to Drosophila melanogaster and its effects on sleep were evaluated.

[0203] Wild-type Canton-S (CS) flies were placed into a 50 ml vial containing yeast, dark corn syrup and agar food. Each colony was housed at 25° C. with a relative humidity of 50% in a Forma Scientific incubator. The flies were maintained on a 12:12 light/dark cycle with “lights-on” commencing at 8:00 am and “lights-off” commencing at 8:00 pm. Virgin females were collected and maintained in 50 ml vials for one day. During the second day, flies were individually placed into glass tubes (65 mm in length, 5 mm I.D.) containing about 10 mm of fresh food and the tubes were placed into the Drosophila Activity Monitoring System (Trikinetics, Waltham, Mass.). The flies remained undisturbed for the following 48 hours. The amount and distribution of rest was evaluated during the third day to ensure that the flies had adapted to the apparatus as previously described in Shaw et al., Science 287:1834-1837 (2000). On day 4, two hours before lights-off, flies were transferred to fresh glass tubes containing only standard laboratory food or to tubes with 0.5 mg/ml methylphenidate that had been dissolved in the food. The amount and distribution of rest during the following 14 hours was then evaluated.
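
The timing of the day 4 transfer and the evaluation window follows from the light/dark schedule given above and can be worked out as a short calculation. The sketch below is illustrative only: the calendar date is an arbitrary placeholder, and the function name day4_schedule is hypothetical.

```python
from datetime import datetime, timedelta

LIGHTS_ON_HOUR = 8    # "lights-on" at 8:00 am
LIGHTS_OFF_HOUR = 20  # "lights-off" at 8:00 pm
TRANSFER_LEAD = timedelta(hours=2)       # transfer two hours before lights-off
EVALUATION_WINDOW = timedelta(hours=14)  # rest evaluated for the following 14 hours


def day4_schedule(day4_midnight):
    """Return the key time points for day 4, given midnight at the start of that day."""
    lights_off = day4_midnight + timedelta(hours=LIGHTS_OFF_HOUR)
    transfer = lights_off - TRANSFER_LEAD
    window_end = transfer + EVALUATION_WINDOW
    next_lights_on = day4_midnight + timedelta(days=1, hours=LIGHTS_ON_HOUR)
    return {
        "transfer": transfer,
        "lights_off": lights_off,
        "window_end": window_end,
        "next_lights_on": next_lights_on,
    }


# Placeholder date; only the clock times are taken from the protocol.
schedule = day4_schedule(datetime(2000, 1, 4))
for name, moment in schedule.items():
    print(f"{name}: {moment:%d %b %H:%M}")
```

On this schedule the transfer falls at 6:00 pm, and the 14-hour evaluation window ends at the next lights-on, so it spans the last two hours of light and the full 12-hour dark period.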

[0204] The amount and distribution of sleep during baseline was similar to previously published results in all treatment groups (Shaw et al., Science 287:1834-1837 (2000)). Flies that were transferred to normal food two hours before the beginning of the main rest period did not show any reduction in sleep over the course of the 12 hour dark period when compared to flies that remained in the original tube, as shown in FIG. 2. The similarity in rest pattern for both baseline and transferred flies indicates that the transfer process did not alter the sleep-wake cycle. In contrast, flies that were transferred to tubes containing 0.5 mg/ml methylphenidate showed a significant and substantial decrease in rest that became more pronounced throughout the rest period (p<0.05), as shown in FIG. 3.

[0205] These results demonstrate that methylphenidate is biologically active in invertebrates, increasing waking and reducing sleep during the normal sleep period in the fruit fly. Methylphenidate has an effect in treating excessive daytime sleepiness associated with narcolepsy in humans. Furthermore, the administration of methylphenidate to normal subjects during their typical sleep times indicates that it can be effectively used to reduce sleep drive. In addition, methylphenidate can alleviate the decline in cognitive functioning associated with decreased levels of arousal induced by chronic sleep deprivation. Therefore, these results demonstrate that methylphenidate has a similar biological effect in both mammals and invertebrates.

[0206] Throughout this application various publications have been referenced. The disclosures of these publications in their entireties are hereby incorporated by reference in this application in order to more fully describe the state of the art to which this invention pertains.

[0207] Although the invention has been described with reference to the disclosed embodiments, those skilled in the art will readily appreciate that the specific details are only illustrative of the invention. It is understood that modifications which do not substantially affect the activity of the various embodiments of this invention are also included within the definition of the invention provided herein. Therefore, it should be understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the following claims.

1 105 1 274 DNA Drosophila melanogaster 1 gggctagtct gcaggttaac gaatcgcctt cggaatcggt tttbbbtttt cttagttagt 60 tttgtaagca tttttaaact aaagccaagt ctaggttaag ttttgtacaa atgcgtgggc 120 aaagtgcacg attgttgaaa agttggttaa gtgagtagca agtagataaa gcaaatttva 180 gctaaaavgt aacagcaacg gttgagagta aattttgcta gamatatacg tatggtaaag 240 cssgttgttt attagccaat cttcaattat acaa 274 2 442 DNA Drosophila melanogaster 2 taaagggact agtctgcagg tttaaacgaa ttcgcccttc ggaattcggt tttttttttt 60 cggtttgcaa ttatatttma tggaaaactt gaaaagattg tytttgcgtt ggaattgaac 120 tggagaagtc gaatgtttgt tgtggcgtgt tagtgttata ttcgaaaata attatgcatc 180 atargktaga cttttybcat ttgagttatt tstggttttc ataacttagt tcttgcttta 240 caacgatatg tttgcttctt catcatcatc attatcatct tcttgttctt gttttcctgg 300 ggattctatt cactaatatm tcgccctaag kytttcgatt tcgbttaatg catttcgctt 360 ttggattaga gtaggctaga ctagagttga gttgagtaga gcctagagtt gagtagagag 420 catagttcaa gcatatgsmm gt 442 3 322 DNA Drosophila melanogaster misc_feature (1)...(322) n = A,T,C or G 3 gggactagtc ctgcaggttt aaacgaattc gcccttcgga attcggtttt ngndttttct 60 gtctggttgg ggatgcatta gagcattagc tttttaactt ctggttccat tttatcagtg 120 atttcgattt gcgtatctct gcgctttttt tttttggttg tggtttaatt agttgaaaag 180 gcactgctgc attcaaacta ttccatttbg ggstttatbb sgtttagscc cccggatctt 240 gaccccttag tctcatttgg taaatttatg tttagaatac acaattnatt bbngvtttcc 300 ttctagtgca cctgacctaa tt 322 4 46 DNA Drosophila melanogaster 4 taagggcatr tctgccagtt tacrgattgc ctyggtgaat tggcac 46 5 52 DNA Drosophila melanogaster 5 taaagggact agtctgcagg tttaaacgaa ttcgccttcg gaattcggtt tt 52 6 378 DNA Drosophila melanogaster 6 taaagggmct agtctgcagg tttaaacgaa ttcgcccttc ggaattcggt tttttttttt 60 tccctttctc ggctgggaca acaattcgct ttggcttcac accaaatttg atttcattcg 120 caagacgctc tcaggttaaa tggtgcagaa agtcctkttg acttctcgac tggctgaccs 180 acgcccaagt aggatctcga aatgaattta aargccacct ttgtgcacgc taaaaaaagg 240 ctgttctcca aattatgcac aggttataca atrgggtggt ctgttgagta gttttgaaag 300 tattagcaac gtggaacttt tatttrtbcg sttcttcaaa tcgaaatgaa 
aaatcaaatc 360 aatcggattg tcgtataa 378 7 83 DNA Drosophila melanogaster 7 gccaagctca gamcttamcc ctcactaaag ggactagtcc tgcaggttta aacgaattcg 60 cccttcggaa ttcggttttc ccc 83 8 114 DNA Drosophila melanogaster 8 taaagggact agtcctgcag gtttaaacga attcgccctt cgtgaattcg gatcgcattg 60 tgtgtgtcta tctaatkgta actgtaattg taattgtaaa tggtgtctac aaat 114 9 639 DNA Drosophila melanogaster misc_feature (1)...(639) n = A,T,C or G 9 cgacggcagt gvaattgtaa tacgrctcac tatagggcga attgaattta gcggccgcga 60 attcgccctt cgtgaattcg gatcgcattg atctgttgtt cgaagaatat caaattttcg 120 actttttttt tagatcattt gaacattttt tcttgataac gtgggactgc ctacttttat 180 ttcattcata aatttgttaa acatggggaa aamaaaannn ccgaattccg aagggcgaat 240 tcgtttaaac ctgcaggact agtcccttta gtgagggtta attctgagct tggcgtaatc 300 atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg 360 rgccggaagc ataaagtgta aagcctgggg tgcctaatga gtgagctaac tcacattaat 420 tgcgttgcgc tcactgcccg ctttccagtc gggraacctg tcgtgccagc tgcattaatg 480 avtcggccaa cgcgcggggg agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc 540 tcactgactc gctgcgctcg gtcgttcggc tgccgcgagc ggtatcagct cactcaargg 600 cggtaatccg gktatcccca gaaycagggg gtaccgcrg 639 10 594 DNA Drosophila melanogaster 10 caagctcaga ttaacctcac taaagggact agtcctgcag gtttaaacga attcgccctt 60 cgtgsaattc ggcatctcka ctggmcaata ggcgtctacc ttgaagvatc tctcttgtgt 120 cttcgctttc catcctcccc gaagraacaa aaacttttcc ccacatgatt tatgcaattc 180 aattcataaa ggttgctttc tttctctttc tcagtttttc actacgcatt tgtacttgca 240 gccacggcca caatttgtgc acggccccgc cttcttctgt cattttttaa gtaattctgt 300 tctactctgc tggccgggta cgaacaacaa aaaacttttc cccaaactgt cataaatttt 360 tbbtcggaat gctgcacaaa actttgagtc gcctggggaa aaagttttcc cagcttataa 420 atccatgtaa aagttttgca tacacacgcg caatttctca acttttgctc ttcccataca 480 cggaaagtac atagcmacam acaattatga gagcaaaaac aaatttggca atcaagttca 540 catgtcttca taagrvtccc cttttctccc cgaaaaamaa aaaccgaatt ccga 594 11 382 DNA Drosophila melanogaster 11 actggcggct gtgatgcact cgttcctcct cctagtcttg aagctcctgc tgcacctggt 
60 acatctgcgt cctttgtggg ctttggattt ctttggcttg gtggtttcga tggagcacca 120 cgtgctgttg gctcgccacc ttttgcggcg acaccagttc ctgttttggt ctggagcgcc 180 ttgtttcgat ttactggata acggcggcct ccgcccaaga attttgacat ttccgtagac 240 attgcgttta cgccgaaagt tttgtaatag ggtctcttgg tcgaacggga atttcaattg 300 abctcaaaat ttagtttttg tacttttkct caatggttga aaacgatttt gttgaatttt 360 tatgcaatav ttbvvcataa tg 382 12 490 DNA Drosophila melanogaster misc_feature (1)...(490) n = A,T,C or G 12 aaagggacta gtctgcaggt ttaaacgaat tcgcccttcg gaattcggtt tttttttttt 60 ctttttttgt attgcgtttt ctttggcgtg ttttgtggac atgccagact gcccgtcgcc 120 agctggatgc aattccacaa ctacacttcg caactcgtga tttttggcca aatgcgaaaa 180 ggtcgggcac atgtgcgcgg aaataattac gagtggaggc tgcagccgga aatgtgaatg 240 ggtgtgcgat aatgcagcac tatcgaatgc cacagggaaa cccgctttcc atcgccgcag 300 attaccgaac ccacagagca ggcatgcaaa cagcatttgc gaacaacaga attggtgtca 360 atgaacctaa gcgaatgcaa atvtgcattt ggrcgaggag aaagctctct aagaacgsac 420 aagtcctggn ctgcaaagca accctcgcaa taaataattc gcatgtacac aagtgcaggr 480 ggggcagtcc 490 13 228 DNA Drosophila melanogaster 13 atgcgggcac atggctctac grtcagccaa gtggctgtac atgatgattg abcaccttac 60 tcattacaag cactaggaaa tgcgtactta aactattcag tacaatttgg agaccaaatt 120 taagcctttg tttttaaaat tmatcatgaa tamyytcstg tgagttttsa atgcttaavt 180 atgttgatga gagcattggg ttgttaatac tcaactaaat tcatataa 228 14 549 DNA Drosophila melanogaster misc_feature (1)...(549) n = A,T,C or G 14 aaagggacta gtctgcaggt ttaaacgaat tcgcccttcg tgaattctcg atacaggcag 60 actaagamaa acaataaaca cgacaagaaa aaccaaacaa ctcgagtgct attaaacgac 120 aaaacataac avaacaacaa gaaattaata aattaavaba cccccccaaa aataaaaaca 180 aagagaaaag gaaagamaaa mmmmmmmccg aattccgaag ggcgaattcg cggccgctaa 240 attcaattcg ccctatagtg agtcgtatta caattcactg gccgtcgttt tacaacgtcg 300 tgactgggaa aaccctggcg ttacccaact taatcgcctt gcagcacatc cccctttcgc 360 cagctggcgt aatagcgaag aggcccgcac cgatcgccct tcccaacagt tgcgcagcct 420 atacgtacgg cagtttaagg tttacaccta taavvgsgag agccgttatc gtctgtttgt 480 ggatgtacag agtgatattn 
ttgacacgcc ggggcgacgg atggtgatcc ccctgggcag 540 tgcacgtct 549 15 376 DNA Drosophila melanogaster 15 aggttaacga atcgccttcg tgaatygtcg atacagggcg cagattgttt gctttatgat 60 tgccaactga aaaaatgaat gtactaacct tagtctatac tcctgcaagt ccaactaatc 120 ctagcttagt ttcaacgctg ccacattcca atctaatcca ttataatcta ttctvatgsc 180 aatccaatcc aatccgaacc gaaccaatcg agtctgcagc gaagtcggct ctcttactac 240 taatcccatt attgssssgg rrrckcttga ttttcgtaaa gctctctgta ccaggtagtc 300 gtaatcgtaa ttgabcgttg caagtgcttc gggagtttga tttttgaaaa caatattkbb 360 acattcatgt gtaact 376 16 256 DNA Drosophila melanogaster misc_feature (1)...(256) n = A,T,C or G 16 taaagggact agtctgcagg tttaaacgaa ttcgcccttc gtgaattcga aactccgtcc 60 tgcataaata aaattaaata ahagaataan atmccygcga aacgtataga matatcatav 120 cgtaaagmcg atttacacga aamttaaatg caaaacavac aaacaaaata cgagamagca 180 agaacaacaa aattataatg vvatcgcatg tstttttstc gtagtctcat cattgaavaa 240 ccaagtgaaa gtaaat 256 17 536 DNA Drosophila melanogaster misc_feature (1)...(536) n = A,T,C or G 17 ttcgtgaaty gtcggtcata gggagaggag aggacagcag caaagcaaag gaaaccaaca 60 tctttaatat ttgatagttt attaagccca tccattcata gtcgtcataa acattaagta 120 cataanvgan aaacccaaca aaagattcaa tataaatctc actgaaaaam aaaaaaccga 180 attccgaagg gcgaattcgc ggccgctaaa ttcaattcgc cctatagtga gtcgtattac 240 aattcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt tacccaactt 300 aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga ggcccgcacc 360 gatcggcctt ccaacagttg cgcagcctat acgtacggca gtttaaggtt tacacctata 420 avggagagag ccgttatcgt ctgtttgtgg atgtacagag tgatattatt gacacgccgg 480 ggcgacggat ggtgatcccc ctkggcagtg cacgtctkct gtcagataaa gkctyc 536 18 330 DNA Drosophila melanogaster 18 aaagggacta gtctgcaggt ttaaacgaat tcgcccttcg tgaattcggg tactaaggaa 60 cttgbaaact tggaactata tatatacaca tgtaatttgc acacacactc atgcactcat 120 gctgcctact ttcgggccga aaaaaaaagg aataacaaaa gatacaaaaa atcagaggcg 180 tattaaatgt atttmattag gatcgcattg tgtgtgtcta tctaattgta actgtaattg 240 taattgtaaa tggtgtctac aaattaacaa gcaaataata actataacta 
taactatata 300 bgaasbatac aggaattcag taaaatthhg 330 19 559 DNA Drosophila melanogaster 19 taaagggcta gyctcaggtt aambgaattg cctttkgaat tggttttttt tttygaattt 60 gcaasgaact gaaatatagg tttttgtata ttcatttgaa atttaattgt actttgcaca 120 aggcaactgc aaatataaag tatttgcacg tttttatgtg acgttgtagg caatataaac 180 aagctggtaa atatgcacgt tttgcgcaat tacgtcaaat aagttctagt gcaatttcac 240 acctcatttt gttgaaatgc ctaagctagc tttatataaa cacagcgaag tgcattgaaa 300 aatgccttag tacccgaatt cacgrwgggc gaattcgcgg ccgctaaatt caattcgccc 360 tatagtgagt cgtattacaa ttcactggcc gtcgttttac aacgtcgtga ctgggaaaac 420 ccyggcgtta cccaacttaa tcgccttgca gcacatcccc ctttcgccag ctggcgtaat 480 agcgragagg cccgmaccga tcggccttcc caacagttgc gcagctatac gtacggcagt 540 ttaaggttta cacctataa 559 20 539 DNA Drosophila melanogaster 20 ttcggaattc ggtttttttt ttttcattcg tgcacttcta tttcttgtga cgagttttta 60 atcgaaaact tatcgctgag agcagcgcct gcgctgcccg attaattttc aatttaaatt 120 tacatttgtw tgcacagcac ggcacagcaa gtttggccat aaattggcat gaccagttga 180 cgttgtcgct cggcagttat cgaaattgct taacggcgat ggccgttccg tcgggattgc 240 cacccacagc gagataagca tgcaaagtgt gtgtccttag tacccgaatt cacgaagggc 300 gaattcgcgg ccgctaaatt caattcgccc tatagtgagt cgtattacaa ttcactggcc 360 gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa tcgccttgca 420 gcacatcccc tttcgccagc tggcgtaata gcgaagaggc cgmaccgatc gcccttccaa 480 cagttgcgca gcctatacgt acggcagttt aaggtttaca cctataavvv gvgagagcc 539 21 464 DNA Drosophila melanogaster 21 ttcggaattc ggtttttttt ttttcaattt taatacattt atattcaaca gttcggtaga 60 tctcttcaaa gtttacttcg ctttatgcaa acattcaagc gaaatgcact ttgtacagat 120 ttgttgtttg ttgtttttaa tagaaagagt ctaaaaatat tcgcttaggt acgaattcac 180 gaagggcgaa ttcgcggccg ctaaattcaa ttcgccctat agtgagtcgt attacaattc 240 actggccgtc gttttacaac gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg 300 ccttgcagca catccccctt tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg 360 gccttcccaa cagttgcgca gcctatacgt acggcagttt aaggtttaca cctatavvvs 420 gsgagagccg ttatcgtctg tttgtggatg tacagagtga tatt 464 22 436 
DNA Drosophila melanogaster 22 acgcaagcta gactaccctc actaaaggga ctagtctgca ggtttaaacg aattcgccct 60 tcggaattcg gttttttttt tttcctttgt gggataacca gggcctctac tccactcata 120 atggagctgc tgggcaacaa caactgacac ttttgcctaa tgctaatgca gttataatga 180 gctctaagcc aattggattg gatgtccctg atttgggttg gtcttgggcc caaagtcgaa 240 gtcaaagtca gaagacgctt cggacatttg cataattvav vbbgatgttc atgctgccag 300 tgaagagcgt agttgtgtac ttgccagctc tatcgraaag gctgtccgac ttgtgtcatt 360 ggttatbgat avgccagcaa ctagggcgct gattgagttg ccaaggtgag gagcscataa 420 atkgcattat agggcg 436 23 429 DNA Drosophila melanogaster 23 gggactagtc tgcaggttaa acgaattcgc cttcggaatt cggttttttt ttttcttttt 60 accgttaacg tccgagcgct gctgctctgt acgcgcgcgt ttgtgcgtgt ggtgtgtgtg 120 ctacgsgcgt gaatgtgtgt gcatggcgtt taaaagggat ggcgggggaa gaggatgcgg 180 agtttttgga gttttgcagc cgactaaaat tctcggcaat ggcaatgtca attcgttttt 240 cgagacgcgt ctatcgttgc aatttatgta tttbaacaga aaattgattt tcctctgcgg 300 cctgggctga tttcagctyc ctttttagtc ctgcgtgggc ttctgaatcc tgggsggcgc 360 ctcagtgttg atttacttca ggttggcttt atgtcggttt tctawtttgg gcatacacaa 420 ycaaggaca 429 24 505 DNA Drosophila melanogaster 24 gactcactat agggcgaatt gaatttagcg gccgcgaatt cgcccttcgt gaattcggtt 60 ttcgcagacg gacggagcaa agatttgtgc atcgaaaatt gcaattttca catttggagc 120 cgtgtgtkca gcatcacaga cattggaaaa gcgcttgtcg cacttgtgca aaaacacagc 180 ttgataaagc aaacgcggct agaaggcgat tggccgcaaa ttcgccctgg aactggaacg 240 caattgcgtg tgaaaaaaag tcggaaaaaa agtaaaagag cagagactga aaccgtcgag 300 aaatcgttgc cgtcgcgctc gctataattg tttgcttttg acagagagat tcgtgtgcct 360 bttggcagaa acatttaata atagtktgta ttttcggcgt gagtscgcag tgaagcatcg 420 aagtagttca agatgagcgg gggattaccs sagttgggct ccaaaatcag cctcatttcc 480 aaggcggaca ttcgatacga gggcc 505 25 509 DNA Drosophila melanogaster misc_feature (1)...(509) n = A,T,C or G 25 tcattrrggc gatgattagc ggcggaatcg ccttcggatt cggttttttt ttcttctctc 60 tttgcttttg ctttcctttt aaattcgaag atttcttttc tcctttcagc gcggccccgt 120 tacttctttc tctcctgttc caccgcctyc gtcgartgcc ctgmatmagt tcttgcaggc 180 
ggcggctttc ttgttgttgt tgttggtgtt catcagcttg gtcagaatct tgcgggaatg 240 accsgaggga tcgaagtggt tggtgtgctc attggcctcc ctcttgttgc gctgcttgcc 300 gctgtgtcag agaagatgtg cttgtcatgg tcgaagttgt agtgctcgta ctcatcgaag 360 tgggcctcgg acataatggc grtttaaggt tgcaaaacta actgcgttct tggttggtgt 420 ctgctttttg tttctctgca aacttgcagg ctgctttcaa atttctcttc tgcttttnnn 480 hgttcacttt tcgctgagct gcgaaaacc 509 26 470 DNA Drosophila melanogaster 26 acgcaagctc agaattaacc ctcactaaag ggactagtct gcaggtttaa acgaattcgc 60 ccttcgtgaa tttggatctg acacaaaaac acgcaagcst caagcgcgca aggagtaagc 120 sggcatcccc acatgccgaa tccgaaaaca aacactgcaa ctctaaagaa aaaaaaaaaa 180 ccgaattccg aagggcgaat tcgcggccgc taaattcaat tcgccctata gtgagtcgta 240 ttacaattca ctggccgtcg ttttacaacg tcgtgactgg gaaaacccbg gcgttaccca 300 acttaatcgc cttgcagcac atcccctttc gccagctggc gtaatagcga agaggcccsc 360 accgatcgcc cttccaacag ttgcgcagcc tatacgtacg gcagtttaag gtttacacct 420 ataarrggga gagccgttat cgtctgtttg tggatgtaca gagtgatatt 470 27 705 DNA Drosophila melanogaster 27 ccagtgsatt gtaatacgrc tcactatagg gcgaattgaa tttagcggcc gcgaattcgc 60 ccttcgtgaa ttcggatctg acacctacga actggccggt ctagcccgtg gtggtcagca 120 gctggccaag ctgaagaaga actaccaaag cgccgttaaa ctgctcgtgg agctagcctc 180 gctgcagacc tcattcgtca ccctggacga ggtgatcaag atcaccaacc gttgtgtgaa 240 cgccattgag catgtgatca tcccccgaat cgataggact ttggcctaca tcatctcgga 300 gctggacgag ctcgagcgtg aggagttcta ccgactgaag aagatccagg acaagaagcg 360 cgaggcacgc atcaaggccg acgccaagaa ggcggagttg ctgcagcagg gcatcgatgt 420 gcgccagcag gccaatatcc tggatgaggg cgatgacgac gtgctgttct aagtattacc 480 ttggatgtgt ttattccagc aatattattg ttaaactatb bstgttctgt atccgtttcc 540 atattgtaat tcgcaaaggg caagccacaa acttaattgt acctatgtag caaaagtgtg 600 tagactattc caataagcag aaaagtatgt attccattty ygtagtgtag tgtaagctgc 660 gcaaatcatg cattaaagkc ttgttgtatc atacggraaa vvaaa 705 28 637 DNA Drosophila melanogaster 28 gactcactat agggcgaatt gaatttagcg gccgcgaatt cgcccttcgt gaattcggat 60 ctgacacaca cactgcaggc atacacctac cgtagatctt gagttggcag 
catcgtaatc 120 gtttttttag tcgacgatgg gaaagaacta gcaaaaatta gcaaaattta aaattgatta 180 tgtgacacac aacacacctt ccacacactc acacacacag atactccatg tatataagta 240 tgtatatbcc yttgagctga acattgtata accgaacgat tacgatacca caaacaaaca 300 cagcgacaca acgaccttag ccatcagaat acaaggacac accgaggtgt gtaaagggtt 360 ctttgcttac aacttttcaa acgaattaaa ccgtttcggc caacctttag gattgacaaa 420 agatcgactt gaggaacaca gccccgccca ttctgctccg aaattcaatg agcggggggg 480 cgggagaatg catcgtttcg gttgatggat gcaaacgrtc aaaaatgtat tggccccccc 540 aagggccaag tgaaaccaaa tcgagccaag aaattgattt atgttcaatt aagcaagcaa 600 aaacaacatg cgaacatttg aaaattgtga aatatag 637 29 402 DNA Drosophila melanogaster 29 cagtgaattg taatacgact cactataggg cgaattgaat ttagcggccg cgaattcgcc 60 cttcggaatt cggttttttt tttttctgaa tttggtacaa tgtgttacaa atttgttaca 120 aagtgttgat agaaagtaat ctgagttcac agcattgaat tcaaattaac tatcctagtg 180 atgcatcttc tacacaaacg ttctttaatt gtaaabbbct gctgtatgtt ctaacgatgg 240 cttttccaat gttattgcgt acagatactt ttcacagata gtccttggca tcaccagtgc 300 agcggcacaa tarttgccac atgcaccaca gaattgagtg cagttgttgg gcttatcacc 360 gcatgaccac catggcacat cactcatacg ccatgtttgt cc 402 30 460 DNA Drosophila melanogaster 30 ctcactatag ggcgattgaa tttagcggcc gcgaattcgc ccttcgtgaa ttcggatctg 60 acacattgga tattgagtta taaatgataa agttgttaaa attcgatata atbcctyysa 120 ggaattatgt taaaybctac aaggcttaaa aggtccatcc tttttatagt ttgtcaaact 180 cagtgaactt aaaataacat gctcttttat atgtatagac cgtatttbbc aattcatact 240 attgattcac tgaacatgtc gtttacaaat gtattatgat aaavgcgcct taacgctatt 300 tatgtataac cggtataacc aaacactata ttavvbscst bacgttagtt ttagtaaatg 360 gaaactgtta gactgtgacg aaacaaatat atgtgtcaac acttacttyc aagaactaca 420 tagacacaaa caaaatatac cgatgagaaa aammgaaaaa 460 31 207 DNA Drosophila melanogaster 31 ctcactatag ggcgaattga atttamgcgg ccgcgaattc gcccttcgga acttcggttt 60 tyyyytttyc gasggacacg gaaggcatgc aattttstgc agtgtaggtg tsagtstctt 120 cttaaattca aatcgtttcc aacaataaat ctatgcaaaa gcaatttgca aattgcacaa 180 gtaaaataca aatggggact tctatat 207 32 349 DNA 
Drosophila melanogaster 32 agtsaattgt atacgactca ctatagggcg aattgaattt agcggccgcg aattcgccct 60 tcgtgaattc ggatccagta ctaccagaac cagataccct actaaaccca agttgcaggc 120 gcctcgcgtt cgttgttgtt tggccaaaag acacttaava caaaavavaa aawtacactc 180 cccatttagt cccaactgga aatcagtaca cccccgtact cgaatccttc tatccgtatt 240 atcaatgttc cgttgtcgta ttccttaacc cgtattttty ccgagtgtgt ttttcgtttt 300 accttcgcgt acttataaga bgttgaactt tttccaagrt tttacgrcg 349 33 534 DNA Drosophila melanogaster 33 gactcactat agggcgaatt gaatttagcg gccgcgaatt cgcccttcgg aattcggttt 60 tttttttttc tttttttttt ctttcgtttg actttagcac acgsacacaa aattccaacc 120 gaaaacgttc aagacaaaaa gcgagcaaca actgagaaga agaatacgat attttggata 180 attttgatta gggggaaatt gattaggtaa ccaccgatgc ttattttkct tctttgtttg 240 ttgctaactt tatcacccaa tccctctttt tgataaaggg tgggatatgg tatactagcc 300 ttaagsccag ttcataacag aaaaatacac gcaagagaga gacagcgaca gaaaaggaga 360 gcgagatgaa gcgtctgaga tccsaattca cgaagggcga attcgtttaa accygcagga 420 ctagtccctt agtgagggtt aattctgagc ttkggcgtaa tcatggtcat agctgtttcc 480 cgtgtgaaat tgttatccgc tcacaattcc acacaacata cgggccggra gcat 534 34 257 DNA Drosophila melanogaster 34 aggtttaaac gaattcgcct tcggaattcg gttttttttt ttcaatttag ctcttttgcc 60 tcttgcttac tgtggttttt tagtagatgt ttttttttta acagctaaat acatgttcgt 120 tatatcatac tatttgagtt tcgtgtttgg tttcttccgg tgtactcctt tctctcacct 180 attcgtgtgg tttgtttttg catggctgcg ttaaaattct cttactaact agatttagct 240 ttttkttcta aattata 257 35 556 DNA Drosophila melanogaster 35 ctcactatag ggcgaattga atttagcggc cgcgaattcg sccttcgtga attcgttttg 60 gctccacgtt tttgaaacaa tgcgtgcttt tgcttgtctt gtttgcgttg ttacattttt 120 kattgccaac gaaatattta tgtgcaggcg aaatcaaagg aagggaagta aagamaggaa 180 aattgaggcg taaaggacga agggacccct attcgtttgt tttgttgtgt tttgtttgat 240 ttatgcgtaa acttttctgt tgccgctttt atgtacgaat tgtgtttccc tcgccaccgt 300 tttacatgct taavagaacg aaaaavaaaa aaccgaattc cgaagggcga attcgtttaa 360 acctgcagga ctagtccctt tagtgagggt taattctgag cttggcgtaa tcatggtcat 420 agctgtttcc ygtgtgaaat tgttatccgc 
tcacaattcc acacaacata cgggccggaa 480 gcataavgtg taaagcctgg ggtgcctaat grgtgagcta actcacatta atbgcgttgc 540 gctcactgcc cgsytt 556 36 95 DNA Drosophila melanogaster 36 ataccatata cgatataatt htccacgtat aavtvttcat cmcattcatc atccgaacca 60 ttcatycatt catactattc atccatccga ttcat 95 37 464 DNA Drosophila melanogaster misc_feature (1)...(464) n = A,T,C or G 37 ctactatagg gcgattgatt agcggcgcga atcgccttcg gaatcggttc ccctyyycga 60 ccattacaga gctatagttt acattacact accttcgatt agaattgctg ggactatatg 120 cgttcagaat aatgattttc attgcccaaa tgaagaggag cattagtttt aaagaggatt 180 tyacgtggtg atcacaacgc ttaaggacac ttgcagccat tcaagtgcct tgcggcaata 240 tccttacgat ccaatacgat ctggtcaagt gacgaaacaa tgtcggtaat attbvbssgg 300 ctatgvnccg aattcacgaa gggcgaattc gtttaaacct gcaggactag tccctttagt 360 gagggttaat tctgagcttg gcgtaatcat ggtcatagct gtttccygtg tgamatygtt 420 atccgctcac aattccacac aacatacggg ccggragcat aacg 464 38 201 DNA Drosophila melanogaster 38 agatccgaat tcacgaaggg cgaattcgcc ttcgtgaatt cggatctaac cgggatcaat 60 cttccagcga cagttctatc aacagaragt actaaatgav acaactcttg aktccaagct 120 aaamvvacag aattttgaag acgaattgcg atcagatgaa atatgggcat taggcaacca 180 aataggccgt agtaggaatt t 201 39 217 DNA Drosophila melanogaster 39 aggttaaacg aattcgccta ctatcgcttc ttgacgagtt cttctgaatt attaacgctt 60 acaatttsct gatgcggtat tttctcctta cgsatctgtg cggtatttca caccgcatca 120 ggtggcactt ttcggggaaa tgtgcgcgga acccctattt gtttatttbb ctaaatacat 180 tcaaatatgt vtccgttcat grgattatca aarvggg 217 40 141 DNA Drosophila melanogaster 40 gactcactat agggcgaatt gaatttagcg gccgcgaatt cgccctytcg gaattcggtt 60 tttttttttt cgttttagtt tcgtttttgt ttttctttca ataaacacat cacatcatat 120 tcacaattaa taatcataat g 141 41 303 DNA Drosophila melanogaster 41 gacggcagtg aattgtaata cgrctcacta tagggcgaat tgaatttagc ggccgcgaat 60 tcgcccttcg gaattcggtt tthhhttttt catttactgc gtactaaata agavgcgttt 120 taagtgacta ataaaagava ttsgattagt ttaaatvtct caatttggca tatgtmatgv 180 vaacttagct ttaataaavc bstggcacac tgtgtggcac caatgctatc taaaaggaaa 240 tamatgtgaa 
tamacamagc aaaatmvvtb gacctaccaa ctacaacatt ttaactggta 300 aaa 303 42 425 DNA Drosophila melanogaster 42 gactcactat agggcgaatt gaatttagcg gccgcgaatt cgcccttcgg aattcggttt 60 tttttttttc ccaaccaaaa cttggggtgt cataaagttt ggccccgacc aaagtttgat 120 tgaattgtgt gcggggactt ggatgggaaa aatctgccgg aataaaarrg rrgrkcccga 180 aatgaagatg ggttgaaatt gaggctggca ttaggtattg aatgtttata aatcaaaccc 240 gcagtcatgg ctttttataa agvtttycgg aaacgaatgc tgaagtttta gtatttcaag 300 agatatttct tcaaagattt cagcttttyc cttttttaaa tmtatagsyy tcycttaggt 360 ttcggaaatg tgaaactgat tcaaaactaa aatgaatcca ttcaacgtaa taaaacatgg 420 taatt 425 43 174 DNA Drosophila melanogaster 43 gactcactat agggcgaatt gaatttagcg gccgcgaatt cgcccttcgg aattcggttt 60 tttttttttc aaggttttcc agttgttctt tttttttttt ttactcgcgt ktkcctaagt 120 tactaggaaa atgtatttta ttctgcgtgt gtgtattcaa taatcgttta taat 174 44 673 DNA Drosophila melanogaster misc_feature (1)...(673) n = A,T,C or G 44 gactcactat agggcgaatt gaatttagcg gccgcgaatt cgcccttcgt gaattcgctt 60 tctaccgggt caaatgtacg atgaaacgaa aaacagacaa ggcgacgatg gcagcattgg 120 gccaattggc agcacccgaa tgccggatgc ggtaacccat catacgtgca tcaaatcatc 180 aactgaatat gaattaggtt taatcttaaa ggaaattcgc tttataactg atcaggtaag 240 ccggcggact gctgcagatc tgcagatttg cgcagaccat tcgattgtac taaccacaat 300 tttagctctt gaattgcatt ccgaatacta tagtattact ttgcatgcac tgcctagttt 360 tagcactcaa gaaaaaatga cttaaatcaa gtttaactct tcgcttcatc tctttctctc 420 caatccaacc aactatttcc tgtgataact gacataatca atctaatcaa aaatcaaaca 480 gctacgtaaa gatgacgagt gcaatgacat tgccaatgat tggaaatttg cagctatggt 540 cgttgacaga ctgtgcctta tcatattcac aatgttcaca atattrggca caatagctgt 600 actactatca gcaccacatg ttgttgtctc gtagccatat gggcgaggtg gttattgtaa 660 tnggktttst yat 673 45 544 DNA Drosophila melanogaster 45 ttcgccttcg tgattcgtgg taagggagac agccgcacaa agatgacaaa ccgatgcgct 60 tgcttcgcct tggccttcct yctcttctgc ctcttggcca tttcgagcat cgaggcagct 120 ccgatgccca ggtaccaatc caatggagga tacggtggcg ctggctataa cgaactggag 180 gaggtgcccg acgacctact catggaactg 
atgactcgct ttggacgcac catcatacgg 240 gctcgaaacg atctggagaa ttccaaacga accgtggact ttggcttggc ccggggatat 300 tcggggtacc caggaggcga aacatcgcat gggtctggct gcagccaact ttgccggagg 360 acccggtcgg aggcgacgat ccgagaccga tgtctaaggc gtctggatga gaccgggctg 420 cgaatcctgt gaacgttgag gacgagcgtc aactttttcc acctaaactt gggccatact 480 ttgatacttt cctagakctt tcatttgcat tacggttaaa tggtykgtba ttcatgtaca 540 taag 544 46 88 DNA Drosophila melanogaster 46 acaagggaca gccgmaggca acgaacgccy cggaavcgga caagcggaac gggaccaaca 60 acaaagcgaa agaamccgaa agaaaarg 88 47 2873 DNA Drosophila melanogaster 47 atgcagagtc tgcggatctc gggatgcacg cccagcggta cgggtggctc ggccacgccc 60 tcgccggtgg gcctggtgga tccgaatttc atagtcagca actatgtggc cgcctcgccg 120 caggaggagc gcttcattca gatcattcag gccaaggagc tcaagataca ggaaatgcaa 180 agggccctcc agttcaagga caacgaaata gccgagctaa agtcgcactt ggacaaattc 240 cagagtgtct ttcccttcag ccgtggcagt gcggctggtt gtgcgggcac gggcggagcg 300 tcgggatctg gagccggcgg aagtggcggc agtggtcccg gcaccgccac aggtgccaca 360 cgcaagtcgg gtcagaattt ccagaggcag agggcattgg gtatctccgc cgagccacag 420 agcgagtcct cgctgctcct cgagcacgtc agctttccca aatacgacaa ggatgagcgc 480 tcccgtgaac ttatcaaggc tgccatattg gataatgatt tcatgaagaa tctggatctg 540 acgcagatcc gcgagatcgt tgactgcatg tatccggtta agtatccagc caagaatctg 600 atcatcaagg agggagatgt cggaagcatc gtttatgtca tggaagatgg acgcgtcgag 660 gtttcccgcg agggcaagta cctctccaca ttgtcgggag cgaaggtcct tggcgaattg 720 gcgatcctgt acaactgcca gcggacggcg accatcaccg cgatcaccga gtgcaacctg 780 tgggccatcg agcgccagtg cttccagacc atcatgatgc gaacgggcct gatccggcag 840 gcggagtaca gcgatttcct caagagtgtg cccatcttca aagacctggc ggaagacacg 900 ctcatcaaaa tctccgatgt cttggaggag acgcactacc agcgtggcga ccacatagtg 960 cgccagggcg cccgaggcga taccttcttc atcatctcca agggaaaagt gcgagtgacg 1020 atcaagcagc aggacaggca ggaggagaag ttcattcgca tgctgggcaa aggggatttc 1080 tttggagaga aggctctcca gggcgatgat ctgcgcacgg cgaatattat ttgcgagtcc 1140 gccgatggcg tcagttgtct ggtcatcgat cgcgagacct tcaatcagct aatttccaat 1200 ctagacgaga tcaagcatcg 
ctacgacgac gagggcgcca tggaacgcag aaagatcaac 1260 gaggaattcc gggacattaa cctcacagat ctgcgcgtca tcgcaaccct tggagttgga 1320 ggcttcggtc gcgtagagct ggtccaaact aatggagata gctccaggtc cttcgcgctc 1380 aagcagatga aaaagtcaca gattgtggag acgcgtcagc agcaacacat catgtccgag 1440 aaggagatca tgggcgaggc caattgccag ttcatcgtga agctgttcaa gaccttcaag 1500 gacaagaagt acctgtacat gctgatggag agttgcctgg gcggagagct ctggacgatt 1560 ctacgggaca agggcaactt cgacgacagc accacccgct tctacacggc atgtgtggtg 1620 gaggcctttg attacttgca ctcgcgtaac atcatctacc gcgatcttaa gccggagaac 1680 ctgctgctca atgaacgagg atatgggaag ctggtggact ttggctttgc caagaagctg 1740 cagacgggca ggaagacctg gactttctgc ggcactccag agtacgtggc tcccgaggtg 1800 attctcaacc ggggccacga catcagtgcg gattactggt cgctgggagt gctcatgttc 1860 gagttactta ctggtacccc tccattcacg ggctcggatc ccatgcgcac ctacaacatt 1920 atacttaagg gcatcgacgc catcgaattc ccaaggaata tcacccgcaa tgccagcaac 1980 ctgatcaaga agctctgtcg cgacaatcca gccgagcgtt tgggctacca gcgtggggga 2040 atcagcgaga tccaaaagca caaatggttc gatggcttct attggtgggg cctgcagaac 2100 tgcacactgg aaccgcccat taagcccgcc gtgaaaagcg ttgtggatac aacaaacttt 2160 gatgactatc ctcccgatcc tgagggtccg ccgccagatg atgtcactgg atgggacaag 2220 gacttctgag gagaatcaga acccgtttcc tagacgatgc tctctaaacg cttctgctgc 2280 agaaaaccag gaggatatga aagccaggga agaaaaattg atcttaagtg cgccatatgt 2340 acgccaaagc caacagcaac agtcagcagc tcgcatcgaa aagctgccat aaaaaaaaac 2400 aaagaaacgt agcagtcgca aggtcaaggg ccgacacaaa agcacaatca tccatcgtcg 2460 tagctccatt tgagatttat agatacgtct ccgtgatgtt ataaccatga tgtgcaacgc 2520 aatgaatcta ttaacgagtt tataactatt attttataat gaggatatat gtgtctagtt 2580 cgcttggaat tgatgtaaat tgtaagtagg tctgtgactc tgtttcagag ctctgttagc 2640 catgtgcatt gtataaattc agctatttgt atctattaaa tatttttaac ataattatta 2700 cacatcattg ttaaagcata caaatcgggt tgccttatag tctgtaagag aacatttgaa 2760 agcaacattt gaccaagatc ttccgtcaca catttcttaa aattctatgt ggcctctcta 2820 ctgtctttca ttagtcttag cgatcatgtc tattatatgt acgataacat gcc 2873 48 1357 DNA Drosophila melanogaster 48 
caggaataaa attagaagaa accataaaag tgcggaaatg aaaatggagc tcaaataggc 60 gacagacggc agctggtgga acaggatgag cggcgtggag gagcaggcga gtctgctgcg 120 cgtactgatc aactgtgcgg agaaggcggc caacatagcc aggatatgtc gctccaacga 180 gcagctcctg gcattgctgg tgcaggagaa gatcggatcc gaggccaatg agcgcttcga 240 gcacgacttc aagaccctgg ccgatgtgct catccaggag acgatcaagc acgaggtggg 300 tgcgctcttt cctgccatga aggatgccat cctgggcgag gaatcggcca atttcaccaa 360 ccaactgggc gagagtgtca ccattgcagt gggtgccact gaggaggaca ccgccgcctc 420 tctgcaggca gtactcagcg ggcacgagga tgcggccagt gctctggcca ccgaagtcca 480 tcgggatgtc agcttcagca gcgaaaagct gggcgagatt gcccaactgc cggacgaact 540 ggattacggc aatttgggca tctggatcga tccaattgac gctaccgccg agtacatttc 600 gggtgacacc atgttcaccg actttcccgg catcacgtcc actggcctgg actgcgtcac 660 cgtgctaatt ggcgtctacg aacgggacac cggcgtaccg gtgatgggag tggtggctca 720 accgttcggg gagaagctgg aggagaacgt gtacagctcc tcgatgttct ggggcgtctg 780 cctgccgacg gtaagggcac acaactgcga ctttgaggcg cgcgatgaga accgccgttt 840 gggcatcttc tccagctcag aacagtccga tatccttcag cgtttcctcg atctgggcta 900 cgagtttgca ttctccgccg gtgctggaca taaggccttg aaggtgatta cccacgaggt 960 ggatgtgtat ctgcttagca aaggctccac cttcaagtgg gacacgtgtg ctccgcaagc 1020 catactccgt gcgctcggcg gagatgtgct ggactatgcg gccagtgtgg cggaccaaaa 1080 ggcagtgcca ttgaagtacc tgattgagga tgctgaagcc gatgccgatt ggaagcgaaa 1140 cgcgggtggt ataatctctg tgcgcaatgt cgatgttgtg gatgagctgc tggccaaatt 1200 ggccgagcaa tgagtgttgt tattatgtta tttgttgtaa acccattatt gctgttgtaa 1260 ttgatttgat gtaattgcac tacgagtcgt cagtcgaatt tgtaggcata gaaactacta 1320 caagaaaatt tgctaataaa tgaaagcggc gaacgtg 1357 49 1533 DNA Drosophila melanogaster 49 aacatcctct ttaactgtgt tgtcgttatc aatatgattg tggccttctt ctatcccttt 60 gataatactg tgccagagct cagttcccat atctccttgc tgttctggat cataacaata 120 ttctcacttg ttattgtgtt agcgctacct cgggagtctg gaattagaac cttcataggc 180 tccgtcatcc ttcgattcat ctttttacta gggccggagt ctactctctg cttactggga 240 gtcatcacgg tgaccctcaa aagcgttcac atagtgagca tcatgggcaa caagggcact 300 ctggagaagc 
agctaattaa aattataact gactttcagt tactttacca ctgcatttat 360 atagcgtttt gcttctgcgg tctcatattt catccctttt tctactcatt gctgcttttt 420 catgtggtct atcgggagga gaccctggtc aatgttattc gatccgtgac ccgcaatggt 480 cgctctatcg tgttgacagc tgtcctggct ctgattctgg tctatctatt ctcgatcatt 540 ggctacatgt tctttaagga tgactttctg gtcagcgtgg actttgagga gcaagacaat 600 gcgccgcctc cgtctgtccc tttgactcta tctgtaccgg tatcgggaga ttcctgttct 660 gctcccgatg atctgggcaa ctgccaggca gccaaggaag tagcaccacc gagtgctggt 720 ggaggtgaag tgaaggagcg ttcctgcgac tcgctggtta tgtgcatagt aaccacactc 780 aaccagggtt tgcgaaatgg cggcggtatt ggtgacatac tcagggcgcc ttcaagcaag 840 gagggcttgt ttgtggcacg tgtaatttac gacctgctat tcttcttcat tgtcattatc 900 attgtcctta accttatatt tggtgtcatc atcgacacct tcgccgatct gagatcggaa 960 aagcagcaaa aggaggccat tctgaagacg acctgtttca tctgctcact caatcgatcg 1020 gccttcgaca acaagaccgt gagcttcgag gagcacatca agagcgagca caacatgtgg 1080 cactatctgt acttcatcgt ccttgtaaaa gtgaaagacc ccactgaatt cactggcccc 1140 gaaagctatg tatatgcgat ggtcaaggct ggaatcctgg agtggttccc gcgcctgcgt 1200 gccatgtcgc tagcggctgt ggacgccgat ggggagcaaa tcgagctgcg cagcatgcag 1260 gcacagcttt tggacaccca gctgctcatt aagaacctgt ccacccaagt acacgaacta 1320 aaggatcaca tgacagagca gcggaagcag aagcagcgat tgggattgct caacacaacg 1380 gccaacagcc tcctgccgtt tcagtgagtc tgtaagaaat agaatagaat actaattttg 1440 gcacataatt ttggtgaaca ttccatgtga acccaaaggt attggccccc ttcgccacat 1500 ttatcctatc aaaaaccatt ctcaatatat caa 1533 50 10519 DNA Drosophila melanogaster 50 gtcgacccgc agatggacac cacctggcag tgcctgcgcc aacccattgg caccgacggc 60 gtccacgacc gtgacaacgt tcatccaggc aatcattcga aggagacgtt gtcgcaggca 120 gtcaagctgc tcaagggcga atactgtgtg cacgctatag ataaacacaa tatggaccga 180 gccattatct tctgccgcac caagcaagac tgtgataact tagaaagatt tctccgccag 240 cgaggcggaa aacattactc gtgcgtctgc cttcatggtg accgaaagcc gcaggagcgt 300 aaagaaaacc tggaaatgtt taagcggcaa caagtcaagt ttctgatttg cacggacgtg 360 gcggctcgtg gtctagacat tacggggctg ccgtttagta agttctcaaa ttttgaaaca 420 ataaccgtcg ttaaaccatt 
ttccaaataa tttacttcct tatcctcagt gataaatgtg 480 acgctgcccg atgacaagac caattacgtt catcgcatcg gacgtgtggg cagggcggaa 540 cgcatggggt tggccattag tttggtggca acagtgccgg agaaggtttg gtaccacggc 600 gagtggtgca agtcgcgcgg ccgtagctgc aacaacacaa acttaaccga agtccgagga 660 tgctgcatct ggtacaacga accaaaccta ctggccgaag ttgaggacca cttgaacatc 720 accatccagc aggtggacaa gaccatggat gtgccagtta atgacttcga cggcaaggtc 780 gtgtacggtc aaaagaattt aagaactggc agcggctacg aagaccacgt cgagcagctg 840 gtgccaactg tacgcaaact gacggagcta gagttgcaat cacaatcttt attcttgaaa 900 cgtcttaagg tctaacgcgg atctggagcg acatttgttt gctatgcaag gaaaggatgt 960 ggtgaattga gaagtgcaag accaactcac gcggaaatac ttttgcggca caaacacaca 1020 attttgtata tgtaggaagt cggagctcaa gctctgctat taaaaataaa catttaatta 1080 aactaacatt aaaatcgtct gtctaaagag taaacagcct cgtttgaaag ctggagtccg 1140 gttaccgtca cttcttctcc tcctcctccg cctcctcctc ctcctcctcg tcctcctcgt 1200 tgaactcggt tcgcatctgt tgcagcgctc taagctgctg gcacagctgg ttgccgaacg 1260 actcgacgta gcgggggctg atcttgtcca gcagagcgta cggcgtttca aatatctcct 1320 cctcctcgtc gtcgccttgc tcttgtccgt tgcgaagcag tgcagagatc cggctgtcct 1380 cgtctatggt gtcgtactgg gaggagacga tcttgaagcc acgactgcgt cacctgaacg 1440 cagcaagtgg cgctctcgat agtccggatg ttcaggtaaa tttgcgtggc atcgcttgcc 1500 agtttggatg agatgcaaat ctcagcgacg tgagccttaa catcgttaat tatggcgttg 1560 gcttcgtcct cgcagttgaa ggcctgctcc tccccagggt ctgcggtggc tttcggttcg 1620 cccatcctaa cagaatcgct tcggagctct gcggctcgcc ttttaattga atcccagtcg 1680 atctgattcc cttttacagg cgcaacatgt cgttccactc tactccatat gtgtcacagc 1740 agctgtgtgg cgtgtgccac gcgttagtac attttgaccg atgaagcgga gcagggcgtt 1800 ccggccaaaa ctgaatataa tcgttattaa tgattttaca tgcaatttag cttagttttt 1860 ggtttaatag aagttcgtaa gacgtaccga tataaaacgg ttgctgtcgt atcgaaagta 1920 taaattactc ctaaataatt gaaattggaa tggaggggtg catagtatag attatggatt 1980 aagtccaaaa tttgtatata gcctcttttt tacattcttt ttacaaaaac tgtaaataat 2040 tttaatttaa ttaaataatt taaaaaaaaa acgaatagca attttttatt tattattagt 2100 tcttaatata tgctccctta ttaacttgta 
attgtataat taaaaaactt aatttaattt 2160 ttaaataaaa aaatattaat aaaaaaatat tctgcgaatg ggttgcatca aaaaaaaaaa 2220 acacgtgcac gaccttttcc ttgggtgttc cctttcaccc ttcatttctt agacaatctc 2280 gctttcgccc attcaaattt cttagacaat ctcgcttccg cccatacaaa ttttatgcat 2340 acagaaatga cctgcggcga tcgttgcttg tgagtcgctc accagaaaag gttattgcgc 2400 tcgtaataaa aaactgtcgg atatttttta ttcccgcaag tagagtgtca aaaaacaccc 2460 ggctgcctag atacaacagt gtttagcggg tgcacttgga gtgccagtgg cttgctgctg 2520 ttctctgtgc acactttcga tgcgatttct tacagtgagt cccgtaacag cagctattaa 2580 tgttgcgata gatggcgtgt tagtccagag ctggaaatca ttgatgtcgg caggtaaacc 2640 gataagatag aacatcggtt cagttccatc tctaatttcg ttgcgaaaaa aataagaatc 2700 gaggcatcgg tacggacgtc aaaaactgca attataagag tacacgcaca gattgtggtt 2760 cttggcaggc acagactttc actgcgtgtg acagcgatcg ctagtgcaag ttacccgttc 2820 gcagtcaaag tgacacaggc atcaggatga gcgcacctgg catggacaag agaaaactgt 2880 cgtaagtatt ttgtagtgct ccttcatttt tcctccacgg cgcacacaca cgcgccgtcg 2940 gcagagtaca cacactcagc ccgtgcgcga gtgtgagtag cagctgcgcg caggcagcgc 3000 ggtcggcgcg aaattgattt ttcgcggcag aagttttctc gatccagcag gcactttcct 3060 ttccacgtcg ggataaacaa ggaaccggtc aggggtaata atgggctccg gtagtgggaa 3120 ggggagggca gctgcggctg gagtttgatt tcggacaagc gaattcgtcc tggcaagaca 3180 atttgccatt gacggaattc gattttattg attcaataat cgttagttaa aatcttagtg 3240 gcatagtttc aacttatgtt gcattaggca ggaaaatgta aaaaggtaga aaaaaggcta 3300 cgcacatcat cgatcaaaga ctcgagccga tatcattcag atatgtgcta tgtaaataca 3360 tacagaccat tattaggaat tacacattat aactacacag tgtatggaag gaatttcctc 3420 cttcctttct cctccttcct cttgcaatct ctctcggttg gccacacctc tcttgcccgc 3480 aaatcgcacg atttcaatgt gtttaataga gaaaaaaatg agtcgatacc ttcattcatt 3540 ttataaatat tgttctattt aaattgttca atgtttgggt cttcgtgcaa cacataccga 3600 ttccggcata agtgcgcgat aacaacagac ctacagatcc aataacacga caaccaacta 3660 actaaccagc ctggccgtta tcggacagaa agttttacgg ccgcgttagc cgtctctata 3720 attgtaccct atatgagtaa tgccagcact aaccgcttca gtttggccca ggcgaattcg 3780 agttggctta gattgtcatg atgcctgcca atgtccttgg 
ccctgtgcag tcgcccaccg 3840 cccagtgtcc ctggggcacg catccgcttg agcctgtaat caccctcatc atcgtggtcc 3900 tcaaccactt tgtcgagcgc tcgccgcctt atcagctcgc agattctcag agcgggttta 3960 ggcatacaca gcgttgcacc ccaaccccct taacttacca agtttataaa tagcacccac 4020 ctgctcaaga aaatccatac agattgcaat ttatgttgca cgcgaccctg ctgcagagcc 4080 aacaatgccc agtttgcttt cgaactggac gccctcaatg ccaaccgcct agtccgtggg 4140 tgcaggcaag ccctgagggc caagtgctcg gccggatgag ctcccatcgc gtgccacggc 4200 taactgttgc tgcggtgagt catagtccaa tactacttgc gatggctggg ccttatcgcc 4260 ctctgtcaga atgtaacata cgcagctgaa ggaatgtcgg aggcacagtg caactgcgct 4320 aggcaggtcc acgcagtggt tcctgaggtc ttacctgtgc gtggttatca acctttaatt 4380 ccaaatgcgg agacgtggat agtaatccta cgcaatacag agaccatgtt cacttgattt 4440 tcatttattt aggacatctg gcgactcgct ttacgagatc ttaggacttc cgaagacagc 4500 caccggagat gatattaaaa agacttaccg gaaactcgca ctcaaatacc atcccgataa 4560 gaacccggac aatgtggatg cggctgataa ggtaaaatat ttccgagaaa tgggtccatg 4620 cgtattaata tattaatata atatataata taaatatatt aatatatttg cagttcaagg 4680 aagtcaaccg ggcccactcg atattgagcg accagacgaa gcgcaacata tacgacaact 4740 acggctcttt gggcttgtac atagccgagc agtttggcga ggagaatgtc aacgcatact 4800 tcgtggtcac ttcaccggct gtaaaggtca ggaatcgcgg tatttcgcat gcctaggcac 4860 attaattaat ctctgattct ccttgcaggc ggtggttatc tgctgtgccg tgatcactgg 4920 atgctgttgt tgctgctgct gctgctgctg ttgcaacttc tgctgcggca agttcaagcc 4980 gccggtcaac gagtcgcacg atcagtattc acatctgaat gtgagtgaag ttacttggct 5040 tagccaagtc caaatataat acccgcccca acacagcccg ccaataactg aaatcatttt 5100 gatttagtta agaaaagaga aaacaagaaa accaatccgg tgcaaagcta aaacattaac 5160 tttaacgcat gcttcccatc cacccacaca ctgtactatt acgaacctaa caaaccaact 5220 ttgatgattt ttgtggggag atctgtatca tttttccttc tgagaagcta atgtttagtc 5280 aaaaacgagc ccctacatct gacatcaacg cttacttaac agctgcccga acaataaacg 5340 cctaaccatt ttttagtata attttcctgg gttatcagct atcttggcac accatctccc 5400 tcgattggtt ttcccttact cacactactc cgtacaaaca ctactctgat tgcactgctt 5460 tgtaattagt ttattttggg taagtatgaa gtttggtttt ggtttccgat 
tatcaattct 5520 ccagacactg aagtaatcag aatactgatt tttgagacca gaggtaaaca cgactaaccc 5580 taaaactttt ccgatccatc gaacaatttt tgtataattt attgttagtg tttgttttgt 5640 atatatgtac ttcacgaaaa tcatcaaatg cttccaacgt ttctctctcc ataaaaaaaa 5700 taataaataa cgtcaacaca tacacgctcg gatttctcga gacagattct gcgaaactgc 5760 acttcccgaa agaagtgcaa cgattattgt tttttccacg attgcttttc ttgtttttta 5820 agtgtcccat tgcgcataca aatttatttt gaaaccaccc agattagaag tcttcttcct 5880 ttgctagttt gcaagctaag gaacagaagc tataagactt acgtttctat ttttggttaa 5940 cttctttttt tttctcaaaa atctttgcga tcgactcatc atatccacca cctactacca 6000 cccacaacca ccacacattc gctctcgacc tcgccgtcag cgccccgatg gcaatcggga 6060 gggcaacgat atgccaacgc atctgggaca gccgccacgc cttgtaagta tgcaaatcga 6120 aaataatacc agcacgagta ccgcatgaaa aaggtgcaag atcacctgcg gcctgcctac 6180 cgcctccacc tggctcaagg cacccaccca cccaccaccc aaataacaca aagccaataa 6240 ccaaatccga actctactta aacggtaacc aatcgaaatt tatatgttcc aagggaacac 6300 gtcttactaa gctaacattc agtcgcaaaa ctaacttgat cgagagaaca gaggaagaga 6360 tgacaaaata aaaaaacgtc gttttttttt gaaagtactg ctgtaaatac ttggtaaatg 6420 ttttatgatt ttctttcaat gctgtcatct ggctgatttt aattttgtag tatttggctt 6480 gaagggcctt ctatccatca gcacaagagt aatgtagtgg tcgtaactaa acccgtagaa 6540 tagtgtacat acatggcttc taggcagccc accgaaagtt tttgactaaa cattggttaa 6600 gggaccttcc ctccaagctc tcaaaacttg ctgcagaaac taaaataccg gcagaaatcc 6660 taatcccact cttttgtgtc ccgcaggagg acgttgacct agacgacgtt aacttgggtg 6720 cgggcggagc ccccgttacc tcacagccgc gggaacaggc cggaggacag cctgttttcg 6780 ccatgccgcc gcccagtgga gccgtgggag taaacccctt caccggcgcc cctgtagcag 6840 ctaacgagaa tacgtcgcta aacacaacgg agcagaccac ttacacgcca ggtatataag 6900 caattggtcc agaagtcaac cacacagcca gcctccgcta gtttggcaat ccgacaaagc 6960 gtctttgacc atcgaacttg ttattgtgca tatatcgaac aaatattgtt tccttcggtc 7020 gctggatagg tggcgaaggc attcagtcac cagtcgacat accccgtcca gcgatcagat 7080 caacaccccc ttagcagtaa gctcgtgtat ccggcaaggt attttttagt cactttctgt 7140 acatgtccac gcgaatatgg gatccattta cttagatctg catgctgaat tacattacca 
7200 cacactgccc gagctgcagg atgaccaatg aatagattga ttgactcagc ctttcgttcc 7260 ttaggtctgc gccagccgta ctcaagtccc ccattttaaa gacctccttt caagattgtt 7320 ttccaagctt tcgcaaatta aatgaaatga aaattttaat ttccatccaa gacatattca 7380 aagaatgtag cttaagtttg acggatttta gttttatcgt ttcttatgca acgcgtacat 7440 tttgatttct tttcacttat cactgtctac ttctctctcc ccctctcttc actctagata 7500 tggttaatca aaaatattag tcaagtgctg cgagcgagtc aagacccggg tgcttgcgta 7560 caccacacat tcagatacag acacagaaaa cacaagcccc taaatcgtaa cagaagagaa 7620 aaccagattt atttcataat tattagtgtt gtttttaaaa tgcgcgatat gttgaagcga 7680 gtaaacatag aaaccaatgt atttatttta agccgtagca cttagctcta attttaaact 7740 atcgaacttc gtggccgccc tccggtcagt caggttgatt ttggtccact tcgggcaatc 7800 gattcaccaa ttccattgtt caagccatgc atctagcctt gcagctgtac cactatacaa 7860 ctcgaatcga tagccaagcc gcgaagccac catctccacc tatccagtca tgctaatcac 7920 caaaacacca ttcggtgttg atgaaattgt atatcggaag tttgtctaga catttgcttg 7980 gggatcggat cccttgcggc ttagcccgtg agttgggatt cagtcgcgaa gagaggcact 8040 gtaattatat atatgtatat gatatgtgta tctagtttgt ggcatttgtg tcgtcgctta 8100 gtctagctaa tttgttctcg agatctcctc ttcaccgatt gtttctgtta ttttgcacta 8160 aattgttaaa gccagaacta cttatgtaaa aaggatgaat tgtcgagcga cgcgcccagc 8220 agattgctcc gatccgattc taatgcaaat gtaaatacga gttaaacaat taagcatcac 8280 acagttaaac agaggaggac gaagaagagg acactttgaa gggcgatcca attaacgcaa 8340 atataaatta tataacagat attaataatc ccctctatat ataaaacaat taaatacgag 8400 taatcaaaag ttgaattgta acccccgaaa gtattgcaca tatttatgta ttccaaattg 8460 tttcaaaatt gaattgaatc taattaattt ttgtttgatt gttaacccgc ttttaaatgt 8520 aaatgaatgc gattagtcca gcagtaatta acttttatat tgtagtatcc cattctaaat 8580 cattgacatc aattattgcc gatcagctcc gggccaggat aaaagtcctt gggcgcctgg 8640 gtctagggcc gggatccgag agtcgccggc tcatctatag tataaacacg catacatata 8700 taaaattcaa ttattaataa ttacatttta cggtcttgcg tgtgtattta ttgttcgtac 8760 aagtcaaagc aaaaatacta agtaaactaa atttaaactt aacaattgtt tccaaatttc 8820 aattgtattc tggtctaaca cgaagaagcc gaaataaatt tattatacaa actcaccgct 8880 
ttaagttatt acaaagggta ttgcctagtc ggctttgccc atgaaaagct ctgccaggat 8940 ggtaaacaaa ctgattttcg gggatggaac tttcgacgcc gccagctgct tgcagttagc 9000 tttcactctc tccgctcggg ctttgttttg ttttgggaca gtaacgggcc acagccggca 9060 catgcgcttt gcccgctccc cggaatcgct gaacttcgcc gcagagtttt ccgcttcaga 9120 acccggaggc agaccgtcgg agagttcgag ccaagtgcgg gacaaagaag acttttgccg 9180 tgctgttgtt tctgccaaaa acacaggtgg aaatgtgatg tgtccgtttt tgtcgagata 9240 attcaattct gttatcattg actacgggca cgtcctggcc aaagcatcga agaaggcgat 9300 agggaatcca taggcccatg agggcagacc tggaaaaacc tctctatcgc gaaccagttc 9360 gtggcagtgt gtgtgttgcg tgtgtggatt acccgcctta ctccttcatc cgcaatctgg 9420 gcttacaaag cgagaaaagc gctgcttaaa atgcatgttt tatgcgcgtc ccttcattga 9480 ttttcctacg cacaggaatg atatacgaag tcctcaccag gaacaagggt gaggtcaacg 9540 atgaggacaa gacgcgagga agctgcttct agcttcccag cattcatagg aaggtgagga 9600 cagcccgtgc agatcaggaa attcatcccg ggaggatcct ggctcccgta ctgcccgata 9660 gcggtatgga ccgtcctcat cggtgccgca aatgacaaac taccaccacc cactacctaa 9720 catctcgact ctcagcccac acctcacgtg acgtttgacg gatctatgta gtcagtcact 9780 agcgccatct ccactttgag tgccatctcg ccgcactctc cacctcgacc accacggcct 9840 ccggtgcagg cggaggagtg ggcggcagtc ctcagctgcg cacctcacct gcgctgagta 9900 tgctgggtag cgccacggtg acggacagcg gcgttaccgc ctcctcagga tgcggtctaa 9960 ctgctcggga tcctgggccc gggcgggcca gcattgacct ggcgcagctc gcccgctttg 10020 ctttctcaca ccgaaaactt acaggcaact caccttgaag ggcttttagt ttccggcgca 10080 agctttcgac gaacaagatc atcaatacgt ccgtaacttg tcggcgcgct ccagcctaaa 10140 cccacggcat tcctgtatcc cttatcccaa ttttcaacgg gttactttca gcctgcggcc 10200 accacgccgc accacctcca cctacaccac catgggaacg gcacagctgc gggatcaggt 10260 tccggtgtgg gatcgggacc gggatctggc agcggtgcag gaggactcta ccagcagtct 10320 ctgtcgctgg cctcgatgac agcggccctg ccactggctc cgctctatgg atccccgctt 10380 gtcccctgcc gctgacctcc tctcagtcca ccaccaagct ctcctaccgc gacgactttc 10440 tgcagcagat tggctatctg ccaactacgc gcatctacag caccccgacc agcatcgtcg 10500 acgaggacaa ggcccagca 10519 51 1913 DNA Drosophila melanogaster 51 
aagcttgatg gggaaagctc tggcgttgca ctgcgtcctc gcgcttaaaa ttcccaagtt 60 ggaattccgc aagacagagc ttaaatatcc tcgttgcaag gacctccagc gggcggtcgt 120 gcattccgca gcaacgacgg ttgggcacat cccgtctcat ccccactctc tcgaattgtg 180 tccgttggta gccgaagtta cgattgcaaa actttcgtca tctctctctc actctctgaa 240 gcaaaagcca aggccggagc aggtgcatat ggtgccaacc cccctggccc ccccaaaagg 300 ggggtggtgc tggtgcaatt gcaaaggcac aggagcacag gagcagctaa ataaacaaga 360 aagcaaaatc aacaaccgca aaagtctaag acatgtatca cgaattgaaa atgataatga 420 caaaatgaga aaaaacaaat atcagaccaa agttatatca atgcaaacaa aaaaatctaa 480 aaaatacaca taaaatacac taaaaaaaga gaagacttac ttaaatgtga cctagtagat 540 aaaatcatca tcgtgttatc aggacgaccg caaaaaatct ctgaaaatcg tacaaaaaaa 600 tattaaaacc tcaaaatgca ttggtgtgtg tgtcccgttc gtgtaatcgc tggcgcgcca 660 cgccccccgc ccctcccact gtcacgcgct cgccacgccc cccagttgcc gccccattag 720 cagcgggcag atggacaacg tgtgggtggc ccacaaggac atctagtaac acgacgccca 780 acagcagccg caaggtctga tacgccgccc gccacgccca tcgtgtttgg gcggcagagg 840 aagcggtccg gaagcggaaa cggaaacgga aacgggcgag caaaatggtg gcgcggtatc 900 gcgggcaagg cgacggcgcg tccacaaaaa ataaccatag agacgtttaa ggcaaattct 960 aaatgaacaa aagtataaac caaaaacaat tgtaacgaga aaacaaacga attcaaaaat 1020 gagcaatatg agcaacaact aataatggca aaaagcaaaa acgacaactg caaattacga 1080 cacaacacct tcgaaaagac ctcagtaata ataaaaacaa caacaaaaac acgtgaatgc 1140 ttggtggctg caaaggaatg ccttcgtcgt atcgcccttt ttttggactt ggccaagcta 1200 ttattgtata agtatagttt cagagaacag atgcgttgta aaaaataaaa aagtgtatgt 1260 tatattaaca tgacaagcag agtaacagtg cgtatgtgca ataaactaca gttttaatat 1320 ttttgtagca aatatcatta tatacagcag acgaacagtg cgtatgagca ataacttgca 1380 gttttattat atatttgtat taaacatcac tattaccagc atattaagat ctcgataaat 1440 gactactcgt tttaatatat tttatgcctg gcttggcatt tcaatcacac aatgatttat 1500 agccaaagta acacatcttt taaacagcag acgttttatc tattatttta ttcattgagc 1560 gttaaaataa aaagcaaatg ttaaagactc ttttgaaata tagtgttggt ataaagaact 1620 aatattaaag tttggtttgg tgcccaacta aagagttatc attaaagact attcttgatg 1680 aacacttaag catttatatg 
gcttataaaa ctcttaaatt gccgaaggca atacttaaca 1740 tttttcaagc caccgacgtt ggtttaaagt agccatttca ttgtgaaatc agcgagctca 1800 ttgagtgaat cgttttatga ctgagttgtt cgcaacccat aattacgata aatgaggcaa 1860 aatattcatg acacgtgccg attgcgtata tgaatggacg aggaacagaa ttc 1913 52 7396 DNA Drosophila melanogaster 52 tgaggtaccg accttgggaa gacctaaaag tgggtgaggt ttactatttg gaacagttca 60 tctgccccca aaaaaaacga cttacaatcg aggtgatttt gtatattgtg tacccgccct 120 tgtgctcctg ggtgtccgtc actgtgaagc tgtggatcca gccgcccagg ttgagagtca 180 ttctggtttc tgctgcaaga cctggtggtt tagtttctaa acatatttcc catatcaaat 240 gtcgttgttc cccaactttt ttgcaacatt aaagctagtg tgaccaaagg acgccaaaat 300 aaccacttcc ggacttcgaa attaaataag agcatttagt agcttatttt tttaacttag 360 aaggattaaa ttaaaccata aaatattaac tgttttttaa acacttttat ttaaataaaa 420 acggtcacat tttaaaattt aaatttatac tctaatcgaa agtatcataa gtatcgatag 480 tctgcagtgc atgtgggtta tgtcatgact aatgacaaat tgaattgcac agaaaaggtg 540 gaagtccctg ctaacgacat tgcttcatgt gttcatttaa tacaacatat ttatagctat 600 agctttttta aatgtgtcaa tttactataa cattttaaaa cttccgaata aaagcttcac 660 acaaacggat tttcacaagt aatatccaga tagaaatacc attttaaaaa gaaatttttc 720 cgattttacc accatcgctt gcgcatgcgc acggaacgcg tgatgcgttt ttaagtccgt 780 cattaggcca tcgcccatca attttagtgt ttcaaccgcc gcccgatacg gtcacaccgc 840 tcgagaagta gagaaaaaag aaggcgacga accacaggca gcaagcaaaa tcgacgccgc 900 acgtaccggg agcatttaaa taattttttc tgtggattaa ttttaccgaa aaggcttcaa 960 cacaatgact gagcgcgaga acaatgtgta caaggcaaag ctggccgaac aggccgagcg 1020 ctacgacggt gagtagtgat tccgaaaaaa aatggtagca aaaatggtag aaaacggtag 1080 acaaatgcta gtagagtagg gaaaaaattt tctcgactga ggagcgccaa cgtttttggg 1140 gaggggtcgc ctgggatttt tttttgtctt cccgatccat tgcgtcgtcg cgtggtcgtt 1200 gaaaaagtct agagctccgc ttttcgtccg aatgtgccaa ttaaaaggtt ttcttccaga 1260 gaaaacgatg gaaattttca attttcatcg ctgtagctgc attcgaagcg acgcagcgtc 1320 gcgatgagta gtatttcttg cgaaaagaac gggagatgga ggacacagta agaagaggaa 1380 gaagccgaag aagaagcaga gcgtcggaag aacgaaaaga gatgaataga gtgtaaaacc 1440 tgatttttac 
cggcaaaaat tcgcgtggtt ttgcacaaaa agagaaagaa acggaattgc 1500 gtattataat ttacagcaaa tagtttgctc cgctctttct tttttggaac tatatagtat 1560 tagtttagaa ggggaaaatt ttgtgtattt ggcgttgcga gcaagggctg cgcggcagct 1620 agcattttat tttcttcctt ggttgctaag ggctgtttgc agttgattcc agttcgcgtt 1680 cttttttttt ctttcttagt agttgtcaag gcgtaacttt ctagaaacga cacatttttt 1740 ggggattcta gtttcttaaa tacacttatt ccacacccaa aatcaatgat tgtggggcag 1800 gggtaggtgc agtcctctct ttcagacccc ctcccccttt ttttccacga aatgtgcttt 1860 tcatgcgcaa agcggcgtcg tcaccatgcc gttgctgcga aaaatcgatg aatgaatcga 1920 ttgtgtcgcc ggctgtctcc ctctctcact ttctgcctat tgcttaattc tttatttatg 1980 caaattagtt gcggctagtt gcaagcagtt atgcggccac ttggcaaccc tatctaataa 2040 tgcgaaaaaa gggaatataa aattttactg tgcatcttct gatactctcg cattactcat 2100 gcttatgaaa tgtatgcatg cgtgtgtgtg tgtgtttgta cggaggtgta cgtgtgccaa 2160 tgagtgtcac cgggcgtttt attgtgctct tctgtagcct tgcttatatt taattaaaaa 2220 tactccaatt aattaaaaag ccagctcaaa atgcaaagaa agtcgcagtg ggagggggac 2280 ggcaaattcg gcatttgcaa aaagcgcgcg cttaaccaat tgccataaac gctgcgtgcg 2340 ccaaaacgag atggctagtg tgtcgaagtg tcggcgacgc tgcacagtgg ctgtaaaaag 2400 ttaacaaaca tttggcgatt cctctgtctg ccgttccttc ctgctgctgt acttgggcag 2460 cgactatagc aacaataatt gcaaacggta ctctgcacct aattgcctgc aaaactgaca 2520 acattaacgg gcggtggcgg cggcaaagaa gtcaatcaac tattagcaca cacaaacagg 2580 gcatacatgc atacatacac cactggtggg gatatttgtg tctgtatcac atgtttgtct 2640 tcttattttc gcccatcggt gtgcgttggt gtgcagcact tggcgttggc cttgtcacac 2700 tcggacattt tgcggtggag ctcatgctgg agcaccaata ccattgcgcg cgaagagaga 2760 gaaagagcgc ggcggagagc gaagcgggag cgcaaagccg ccagccggca atgtctgcgc 2820 ccccaatctg aacttgcctc gccctctccg cccctgatct catctcctct tcaaacccct 2880 gctccccttt tctgcacaca ttaacgtcag cctttaagtg tgctttctca ggtgctgccc 2940 cctgcgccca ccatcccccg ctccatgctc tttccatctt gcgctctctg cgttctatct 3000 acattttttt cgaggtcgcg cgctgctttt tccgttgatg ttcgttctcg tcaatgtcgt 3060 caatatgcgc aaaaggcaga caaaaaaaaa atgagtggaa aaagtagcat acatacggtt 3120 gattgatggg cggtgggtgg 
cggtggtgtt aggtgtgttt tgtttgttgc tcgctttttg 3180 ctcaaattgc accattttag gggctgttgt tgctgcttcc tttgtctaat tacacattct 3240 ctgcccgctg taacggtatg tgcgcatgtt ttttcagctg ctgcgccctt atgcgactca 3300 cacacacaca catacacaca cacacaaaac cacgcgcact catacccaca cgagcgcaag 3360 acacacccac aggcgtctca taagggaggg gtggatctct gggggcgtgt tggtaggagg 3420 gggtggaaaa tcctgcccat tccttttgct ccctatttcg tcccaatatt tcattccagc 3480 tcgcgtgact tctttttact tttgggtggt ggtggattgc ctttgcaggg ttgtcgctag 3540 tttatcaaat tccaatactt atttcttgga aaaacaacac aattgagatg accacgctca 3600 tctttgcaat caacttgttt atgaaaaaga ttaatcattt tttataataa aaaagatgaa 3660 tacttttcaa atgtagttca actactcacg aatctactta gaatctattc catactattt 3720 tcgccaaatt tgctagactt tcgtgaactt tgcactttct actgaagttc atatgattcg 3780 agaagtttcc atacatttct ttttaccact tcgtggttcc gcaaatgatt tatcatgccc 3840 gcgggacttg gctctttatt tttgggacgc gtttattgat tttcgccttg ctatatgtac 3900 atgatttagt gactggtctg acgcgcatgt ctcataattg gttgtggatt acttactact 3960 aagcaacatg tgctgtatac ttttaaagtt gcaaggatcc cagccaccct tcaccaccca 4020 actgcgctcc ccctcccctt ccccctccct acccactctc aagttcattt tttatgcgta 4080 actcagcata gcttgcaagc gttgtaaaat gcgtgtgcac atcatcatcc acattggttt 4140 caaacagccc gccaatgcga acaacaatgt ttgaagttca atcacagcca caaatattcg 4200 tgaatcatgc aactttattg gcaatttttc acaaatacgt gctacaacaa aaatcgcact 4260 aataagcaga cagtcggcaa atgtcccgat gtctttgtgt gtgagtgcta gtgtgtgtgt 4320 atgtatgggg gtgttggtgt gcgaaatacg tcagcagctg acgggcagcg ggtgcgagcg 4380 agagagggca gacagtgcaa gtcagcaaac gcgctacaat tacaatattt tgaatagtag 4440 tgtagtctag ctctgtattg ttatcagcca ccagatatgc ccgagatagt tagcctgcaa 4500 ctggctttac ccagacaggc agcactgcaa tgataattta gatgatttgg ttcggaaata 4560 tcttacattt cctttggtca gatatcaaaa actaatcttg cataattcca aatactaatc 4620 atatgcccga aaagcattta aatgaacttc gaaatatatt tagacttcaa ttgactgact 4680 cgaaaaagga aatgaaatgg ccaaatgata tcagctctta acctactcac tttaatttga 4740 taaacgagct ccttgtttgc ttccgtgttg agttccgaac atgtgattca tgacgcagtt 4800 ttagatggtg atgcagacgc agatagagat 
gcccccgatg ggacgatagg attgggtgca 4860 acaatgtggc ggaagccctt atcaccggct gattggatgg ccccgtattg gtattggtat 4920 tggtattggt atgacagaac cagctaaatc aggcgtagac gcaacgccac cttatgcata 4980 aatatatctg cataacttat atatcatcgc ggattgggcg tgtcggcgat aagaaatgga 5040 aaataggtag tctaggctgt cgcgagggga caaggacaag cgggaggaaa gtccagataa 5100 caagccttcc ataccttgga taaatttatc tacgcaaaat gtatataatg tatgtatctt 5160 gttggctttc caattgaaat gtttggtttt tatttttaat cttaacatga aatatatgaa 5220 gcaattcaaa attatatgac atcaacaaca tttgattcta cgaattgttg ataaaatagc 5280 gtccgatagc attgtttaaa ctacatcata aacattactt cttactcact caaatgaagt 5340 atttgtttaa tgatgtatat tttccttagg cattgcagaa tgaatataca tatatgcgaa 5400 ctaatgatta aactaatcgt tagttaatgg gcaatccctt aaatttggtt tgagatgtta 5460 cacttgatta ttgtaaacat ttactcttat atactaaaga ccacaccttt catttgttat 5520 gtcctcttta cagaaatggt ggaggccatg aagaaggtcg cctccatgga cgtagagctg 5580 accgtcgagg agcgaaatct gctgtcggtg gcgtacaaga atgtgattgg agcacgccgt 5640 gcctcgtggc gcatcatcac ctcgatcgaa cagaaggagg agaacaaggg ggccgaggag 5700 aaattggaga tgatcaaaac ctaccgcgga caggtggaga aggagctgcg cgacatctgc 5760 tcggatatac tgaacgtgct cgagaagcat ctcattccat gcgccacatc cggcgaaagc 5820 aaagtattct actataagat gaagggcgac taccatcgct acctggccga attcgccacc 5880 ggctccgacc gcaaggatgc ggcagagaac tcgctgattg cctacaaggc ggccagcgat 5940 attgccatga acgatctgcc accaacacac cccatccgtt tgggcttggc attgaacttc 6000 tcggtgagtg attgtgaaga gacacgatgt cgtcatatgt tttctatata tatattttaa 6060 aatccttgca ggtgttctac tatgagattc tcaactcgcc ggaccgcgct tgccgcttgg 6120 cgaaagccgc tttcgatgat gccattgccg agttggatac actgagcgaa gagagctaca 6180 aagactcgac actcatcatg cagctgctgc gcgacaacct cacattatgg acgtccgata 6240 tgcaggcaga aggtaagttg taatatcaca tccacttgtt cggctccgtt attaagttct 6300 aaatattata ctaagaagta gaccccaacg caggtgatgg cgagcccaaa gagcaaatcc 6360 aggatgttga ggatcaggac gtgtcgtaat ttaagtaaac ccccgccctt ccatttaaaa 6420 cacattatat accttctaaa gtaaatataa taccatataa tatatatgca tacaacaaca 6480 acaacaaaat caacaacaac acgacaacaa cttcaagaac 
aacaacatgg aaccaatgcg 6540 gcaacaacaa ccatagcagt cgatcaccac aacagcagca cccgtatgtg tatcaacata 6600 tttgcaacat gataacagca agcaaagcag caggaacaca aattgtcatc cacaataaac 6660 agcatggaaa acaacagcac aaaacagaaa aaaaaacaac tatccaagca aaatgtttga 6720 aaattgggcg gcctggtggg cgtggcgcgg ctgttgctgc ggccccagaa tcgaatcgca 6780 acttaaaaat tcgcggtact attaagttgg cgccgttcgc aatttggctc cttaacgtta 6840 gaaaactgca acatcaagtc tactacattt gaattttttg cattcgagaa accaagcatt 6900 ttattatatt tgaataaata ttatatttaa ttatgagaag caaaaacgaa acaaataaaa 6960 ttcatttaaa ccacaaaaca cattaaagca caagcagaaa aagaacgaaa gtcgaaacac 7020 taaacgaaaa ttgagaaaca atcacacgag agtaaatcac aaagcaaatt tcatcgagtt 7080 ggcatgcatt aattttaaaa caaactttaa atacgacagc aacacgtaaa atacaacttt 7140 aattttaagc caataataaa cgtgagaacg aggtaataaa atgaaaaacg atttaataag 7200 cgcaaacgta aaagtaaaac ataaagtcga aaatgtaagg aaaaaataca gcaattaatt 7260 taaatttaat acatgaataa aacgtaaatg gagtaaacga aacgaaatga gcaaacagaa 7320 tgaaagacgc aaaaagctaa aaactatatt aaatgagaaa aaggaatgtt agttaagcga 7380 aattcaaaac gaattc 7396 53 1467 DNA Drosophila melanogaster 53 ttttttttac taagactgca aaactgtgag gaacataccc ggcatacact gtggatatca 60 tacgagacgg agaaatcttt taaaatactt tggcaagcca gcttataaca atcaatcttc 120 aaaaattgga ttgttgtaca ataataatat tgttaggacc accaaccgac tgagaacgga 180 atcagctcac ataaacttac tgctacaacc cgaaaaaatt ttaacaagga ggcagaaagg 240 aacgttttgt tgagaaaaat gacacttcct agtgcggctc gcgtgtacac agatgtcaat 300 gcgcacaaac cggatgaata ttgggactat gaaaattatg tggttgattg gggcaatcaa 360 gacgattatc agttggtccg taaattaggc cgtggaaagt attctgaggt cttcgaggcc 420 attaatatta cgaccacgga aaagtgcgtt gttaaaattc tgaaacctgt taaaaaaaag 480 aagataaagc gtgaaatcaa aattttggag aacttgcgtg gaggaactaa tataataaca 540 cttttagccg ttgtcaagga cccagtttct cgaacaccag cgttgatttt tgagcacgtc 600 aacaacacag atttcaagca actttaccaa acattaactg attatgagat tcgttactac 660 ttgtttgagc ttcttaaggc acttgactac tgccacagca tgggaataat gcatcgtgat 720 gtaaagcccc acaatgttat gatagatcac gaaaatcgaa aattgcgcct tatagattgg 780 
ggacttgccg aattttacca tcctggtcaa gaatataatg ttcgtgtggc ttcgagatac 840 tttaaaggtc ccgaattact ggtagattac cagatgtatg attactcact ggatatgtgg 900 tcactaggtt gtatgttggc gtcgatgata ttccgaaaag agccattttt ccacggacat 960 gataactatg atcaattggt acgcattgcc aaggtgctgg gcaccgaaga actctacgca 1020 tatttggata aatacaatat tgacctcgat ccaagatttc acgacattct acagcgtcac 1080 tcacgaaagc gatgggaaag atttgtccat tctgacaacc aacatctagt ttctcctgaa 1140 gcactagact ttcttgataa acttttacgc tatgatcacg ttgatcgact cacagctcgc 1200 gaagcaatgg cccatccata tttcttacct attgtcaatg gtcaaatgaa tcccaataat 1260 cagcaataag aagttttttc attttgatga atactgtaat tcgagtttgg gatagaagcc 1320 atttaacaat atgataatta tgcaaaaaaa aaatataaaa tccgaagaaa aataaaataa 1380 cttaagctac tgtcttaaaa atgacttcaa ctgttcgtta agggattaaa cagagaaatc 1440 gaataaattc ctcacagtat aacatct 1467 54 2786 DNA Drosophila melanogaster 54 ctggcggaca ttgaggcccg ccaccaggac atcatgaagc tggagacatc gatcaaggag 60 ctgcacgaca tgttcatgga catggccatg ctggtggagt cgcagggcga gatgatcgat 120 cgcattgagt accatgtgga gcacgctatg gactatgtgc agacggcaac tcaggacacc 180 aagaaggcgc tcaagtacca gagtaaagcc cgacgaaaga agatcatgat actgatctgc 240 ctcactgtgc tgggcatctt agcggcctca tatgttagca gttatttcat gtaaatctcg 300 aatgcaatta cttacacgcc acattcactc cacacacaca cctctccaca cacacacaca 360 ttatatatct catccagacc cgatctatat acataggatg ttgcgcaacc cacccagaga 420 aattgcagaa gaccaaatcg aaacttaaac tgtaagagac cgcaaccagt gaacccccga 480 tacaaaaaaa aaaaagaaat gcacacgaaa ttagggctcg attacaccaa ttatagctat 540 tataatatat agagaaatca tgtatggtta gtaagcaaga gtctcaagtc cccagagtct 600 atcccaaaaa cacatgaaac tcaagttata gctaccagaa tctttagtac ttgtaaacaa 660 attgctaatt gaaaatttat aaaaccaata cgaagagagt tgaaacacaa aaactacaag 720 acacgtaatt gcaaacaaag aaaacatgaa attttattga ttattgttat aatgcgaagc 780 aaaaaaaaaa aacgagaatt gtaataaaaa gaaaggaaca acactactcc actcaaaaca 840 cacacaaaac acactcattg caaatgcttt gcagggaaaa gtgaaagaaa atcttgttga 900 tttcctcgtc gttttaccga ctttgaaaat gcatttagtt tttgttcgtt tagctagttt 960 agttctcctg ttgaaaactc 
acacaaaagt agcctcatca ctcacacact cacaacaccg 1020 atgaagaaga gaaagatgca acaaatctct agttcacttt tgaatattta tgataaaaag 1080 attgtttatt atgcgatata ataactatat tgactattta ctataatgct atttacttaa 1140 catgttcata tttattgaat gcatatgaaa tatataaatt atacaaacac agcgcatata 1200 tatatataca aatataaacc tagataaata gctaactcgg acattctcaa tggagatgat 1260 aatgcgaaac gaaaacatga gaagttgagg aagttgatga tggaatgccg attttaaagg 1320 ggtaaacaat tcaatcacac tcacatatat agacagcttc aaattaatcg gcaaacagtt 1380 aaatttaaag ttgattattc agtttttttc tttgttttgt tgcgcaaaca attgaacaag 1440 tttttgcgtt tgttgtttaa tctaccggaa aaactagttt ccttttgttg atttccgttt 1500 tttgtaatcg ccagaaaagt taacagaaat taatggattt cggtttcagc tatgatttca 1560 tttctcacac tcacaccgca ccgcacacac acagtttgta atactttttc tttctaatta 1620 acctaattta tttttccttt ctttcctccc accgattcac atctgaaaaa ctaacaatcg 1680 gaatgcaaat tcaaaactat actactcata cataacatct ggcatgcggc tgtaaaatcc 1740 aatttaacaa ataaaatcaa accaatcaac tcaacttttt ctgcactcac aatgaaaaac 1800 gaacaaatca tatagaatcc aagcatccaa gtagcatgct catcaaggcc tataatctgc 1860 attatctaca cacgttttgt agtacaattc ggaaaatgca aaatcattat agttactaat 1920 aacaataaac gcaatgtgta tgagtttttg acaattattg tgattaagat acacgaagaa 1980 atgctttcgc ttctaagtag gaaatgcaaa tttgttaatg caccaaacga actggtcacg 2040 cgaattggtc gatttctgtt ttgatcacac gaaattacca actgcaaatg tcagacctgc 2100 attacatgca cgtttatttc ttattcaatt cgttaattca ccttttgatc ttttgaacga 2160 gcaacgaata aatgggaaaa tgaaaaccca aaactgcaac tacaacagca atagcgcaaa 2220 tcagcacgtt ttcaccacca aaacaagagc aacaacaact gcaaaaaaaa aaaaaaaaca 2280 cccaagttaa ccgaacgcgt tcagtgaaaa gaaaatcaaa agaaatatta ctacttcgta 2340 ttatgtcaat caatcaaact gcgatttaat tttaaccaca taaaaatgca atctaaaggc 2400 ccgacaaatt cgaaaaatga ttaaaaataa acgaaaaacg cgactccgcc ctgccacgcc 2460 cttacgttca ttggaagttc cagtccgggt tccagtgttc cagcgctcct gacccgcagc 2520 ttccagcgcg aaaaacagct gcattgctcc attcccacac gctgaggtga gtaatacgct 2580 ataattattc aacatttagc attcttttta tcggcaattc ccgcaattac tctctctacg 2640 tttacgtttt ccctcgcggc gaccattgaa 
aaatcgccga aggtcgcgtt tccatgtcct 2700 ttaaagtccc catcgccacc ctccgcccct ttttgaaaca cccacccgcc cccttgaagt 2760 tccgcccatt ttctattaat gaattc 2786 55 599 DNA Drosophila melanogaster 55 ggtgttgtat gttgttgtat tttgcttgtt ttgaataatt taataattta atttacaaga 60 ccttcttgat ctcatcgtac aacacaagca cgaaagcgcc accagttcct ctgagaatgt 120 tggagaaggc gcccttgaag aaggcaccgg tgccctcctg cttggcgatg gtggcccagc 180 agtgcagtgt gttcttgtag atgacctcgg tggccttgcg accagactgc atcatcatgc 240 gacgacgcac ggtatcgaag ggataggaca cgatgccagc gacggttgtc acaacctggg 300 cgatggccca gctgatgtag atgggtgtgt tcttggggtc gggcagcata ccgcgagcgg 360 tatcgtagaa gccgaagtag gcggcacggt agatgatgat gccctgcacg gacactccga 420 aaccacggta caatccaacg atgccgtcgc tcttgaagat cttggtcaag cagttgccca 480 gaccggtgaa ttcacgctga ccacccttgc cagtatcagc agccaagcga gtacgggcaa 540 agtccaaggg gtagacaaag cacagagagg tagcaccagc ggcaccaccg gaggccaag 599 56 4494 DNA Drosophila melanogaster 56 ctgcagacgg gccttcaaca ttctgtcgcc gcacggcaaa tgcactggtc tcgcttttgg 60 cccatccgaa tgggattatg cggtcgattg gcggccctgc acgcagattc gtctaaaaat 120 aaaaagagca tcgcacacac ttcattggtg tcaatcaata aagttgtaat atatttattg 180 cattttgatt gctttggatt ataacttttt taaagcccat ttacttgtcc acaactttaa 240 agttttcgtt tgcaggaaac ccttttatgt aaaaaaaaaa cactttgatg taagttaatt 300 gttcttaaga taacgaattt cttaatagaa tgattgaatt ctaagtttca ataattattt 360 taagtaaccg ttagttttca aatttccgtt acaaaaagtg tgaccattgc cattaagagt 420 gacgcagcgt taccagatag tatatctgcg gcttgtttta gggaataatt attggacgaa 480 tttttagggt tggtaacact aaggcacaga attacaaaat tttttggtgt acgtttgtcg 540 tggtgtctct gttcctctgc gtttcgcatt acactctaac caaactaaaa ctcatcaaaa 600 gtaagtagct caggagtaaa ccccctcaaa attcagtgaa atgtgatgga tacccctatg 660 ggtagaagga aaatcaggat tttcggtgcc agtgatttca taacctcgaa atgaaaaaat 720 tgcaagtcga cacgcgcatg aacgaaaagc aaaagggctg agcacttttt tgttggatga 780 cgatgggcga gcttcttatc accaaatcac gcaataatta attgataaat ccgaaattta 840 aaggggatat taagccggtt agctgtcaca cacacatcca cacatgaaca gtccacgcac 900 accaatgcct tcgcagttca ttggcagcag 
ctgccggcga aaaaaggggg caaagaagca 960 gagagatcga tggaaaagtg cctggcgaat ttccgttacg taacgcatgt cgcatcaatt 1020 ccgttcgtta tcgtcatgtt gccatccacc agcgtgcagc tcgctcaata aacattttcg 1080 atttcactct ttaccgttct ttatcgtcgt tttgtttgtt tgtggtcatg tgaggaggtg 1140 ggtgcacttc gcgattgtgg gtggtatttt cgcagtgacg tcacctggaa aatcaataga 1200 ccaccttttc gggcagctat tcacttagct ttcctctttc cctatcgatt tccctgctgt 1260 tatgcaactg tcgaaaaaat gagctgcaca tccacccaac cacccccttt ataatctgct 1320 tgtcaaggtc atcgtcaacg agatgctgat tcattttcac cgggacatcc acctgatagt 1380 attattttca gcatctagtg accaggttat atcacctgac ttccggaaca ataagagccc 1440 atcccattta acgtttggct gccggccata aagatttttg tctgctggcc gctgcttttc 1500 ataattttct gccttgattg ccttgagttt ccccccacca atccccccga acacaacatg 1560 aaatatttat attgttatat ctatatgaat aatttcaatt agatgtttgg gttattttgt 1620 ttgatttttt tatgctcaat atcggataga attattatcc gcgactaata ttcctaccgc 1680 actgtgagtc gcattcctga cgtcacgcaa atagaaacga gcaagcagtg accacagcag 1740 acgttgcaaa tgtgcagtta acttctgagt caatgatcgc tagctaacgg agctgaactg 1800 accagcaatt aatgccagca agaacggaaa tggctaaaca tagagagagg caaacctgtg 1860 gccagataaa tagggaaaat ctcagtattt ttattaaaat tttagtttaa aaaacctata 1920 aaaatatttg aaaaataaat tttgatttaa atggttttgc attggtatag aaaaccctta 1980 atttctggag taaaatagac ttagtcacta tttgttttct atcttcgctt ctattttttg 2040 cgccacttta ttttgcccag agcgtcattg accccctcct cagtcactca gaattttcgc 2100 tgcttcagtt ggtgattctt tgcatgtggc cgccttgaac actaatctaa attgatttct 2160 ttattttcgc cgcacaaaat tgacatttag ccaaagtgaa aatggggaaa aagccctcga 2220 aattcttggg acatgttgat gctgtttttt ggcgtctttt agttttcact tttcatttgc 2280 ggccgataga cattttgaaa tcaatgcagc gaaaaatcgg aataatcatt atgttgctga 2340 taagattaaa cttatcgttc cgtaaacgca attaatgacc tggcgagtga aatgatttcg 2400 caataagtat catcgtcatg attatcgatt attcaccaac taattgaagg tgcgatttaa 2460 tgcattccat ttcaatccat tccagatggc tcctccatca tacagcgatt tgggcaaaca 2520 ggctcgcgac atcttcagca aaggctacaa ctttggcctg tggaagctcg atctgaagac 2580 taagacctcg tcgggcattg agttcaacac agccggacac 
tccaaccagg agtctggaaa 2640 ggtcttcggc tccctggaga ccaagtacaa ggtcaaggac tacggcctga ccctcaccga 2700 gaagtggaac acagacaaca cgctgttcac cgaggttgct gtccaggatc agctcctcga 2760 gggtctgaag ctgtccctgg agggcaactt tgctcctcag tctgggtaag gatcacatgt 2820 tattacatat atgttcactg cagaataatt aatattgcat atttaattaa aaatcgtcag 2880 aaacaagaac ggcaagttca aggttgccta cggccatgag aacgtcaagg ccgattcgga 2940 tgtgaacatt gatctgaagg gccccttgat caatgcctct gccgtgcttg gctaccaggg 3000 atggttggcc ggctaccaga ccgcatttga cacacaacag tccaagctga ccaccaacaa 3060 ctttgccctt ggctacacca ccaaggactt tgttctgcac acagctgtgt aagtattgca 3120 attgccacgc atggcaaatg aaatcaatgc aaactaatga ttgccttgtc cggtgtttag 3180 caacgatggc caggagttca gcggctcgat cttccaacgc acttcggaca agctggatgt 3240 gggtgtacag ctgtcgtggg ccagcggcac cagcaacacc aagttcgcca tcggcgccaa 3300 gtatcagctg gatgatgatg ccagcgtgcg cgctaaggtg aacaacgcca gccaggtggg 3360 tctgggctac cagcagaagt tgcgcgacgg agtcaccctg accctgtcca cgctggtcga 3420 tggcaagaac ttcaatgccg gcggccacaa gatcggtgtg ggtctggagc tggaggccta 3480 agtccgtggt tttccgctag tcgtgaaatc cttgaacgtt gtctaacatc cccacacgcc 3540 caacaacaac aacaacaaca ataacagcag ccagacaagt gcagaaacaa caacaagatc 3600 aaacaaaata aaagatacga tattctttaa aaacaaacaa cacacagtca aatataaaga 3660 gaatctgcac atcttagcga ttcgattgga ttggattagt tgtcagccgg ccatcaccac 3720 atccgcccat tccctgcccc agccacgcga atctgcagga gagacactgc caaacaaacg 3780 atgcaggggg atggatgggt ggtgcaatcg atgggtcggt caggtcagtc cgtcgtcgtt 3840 ggatgtagct ttgattagtg cattgaatat tagcgaaatt gttgttgaca aaagtgcagt 3900 aaaaatacat ttacctatta aacgaaaaaa aaaaacaaaa tggaaaagac ggaatattat 3960 ggctcaaaag gcattgcatt tttttcagtt ctcaaggctt tacggattgc ctgcaaaatc 4020 ctgcggcgaa actaggacac acataacacc aaaatcacat ccttaataac aatctttacc 4080 ttttcgttgt gcatttacct gagaaataaa tcgagaagaa ataaattatc ttgattgcag 4140 cgttggtggt ggaaggcttg ggaatggtta gcttaattta aggatttaag cttgaaagaa 4200 aaaactgtat aataagataa gtaccttata ctgatcccaa gttctttgtg cttgaaaaac 4260 ttaacttttt cggttgggca caggactccg tgcgttttag ttcacaggtc 
tccagcgtgt 4320 ataaaactgc atactgatca ggcagcgtta gacataaatt ttcggctaat cgaggctgtt 4380 gaaaagtaga caaaaaattt cgaaattagt cgggccatat ataccatacc acctattcat 4440 acagcccgta atggccgcca aaacaccgac atacccagat ccgttgacct gcag 4494 57 3084 DNA Drosophila melanogaster 57 ccagaaaaaa aaatacaaga aaagcaaagc aaagcaaagt aaagcagagc agaacagagc 60 agagagagcg actcgaaact ccagcccaaa tgacggatac gggtctgaca acgaatcaag 120 cgccgaaatg tgtgaagtgg ctgtgaaacc cgaggtaatc ctaactaaaa accaaaccaa 180 tttacaataa aatgcaaatt actaagaata tttaaccgaa acgcaccaga ccacaccact 240 ccactccaca cacacagaca gacacacgag acacggcaga aatgccaggc ctgaaattca 300 acaagctgta cgactaaaac tttaagagta cttcggcgag ccaaactctt atataacaat 360 atatatatat atatatatat atatattaat atatatattt tcacatgcga gataacaaca 420 gcaacgtttc ctttccgctt ttgccacaaa taataccccc tgagacggtc tttgctcgct 480 cgaaactcag cacatgcttc agcacttagc cctctaagat catctaactg gatcctgatc 540 aagctcctag ccaaaactcc aactccagaa cctctaatct acaaagaaaa acaaacacag 600 aaaatgccgc gctgcctgat agccaagaag tggaaggcct atccctggct ggatcggacg 660 gaggatacat gcaaccagca acagcagcag gagcagagtg cccccaactc gcctcgcgaa 720 ctggaggagc tgcatctcaa gtccagacgc tccaccctgg acgacgacga ggagatcgat 780 gtggtgggcg acaaattcct gatcaaattg gagaagcaac gcacaacagc agatgctgcc 840 gcagcagcag caacatcatc cgaagcagca acatcgcaca gcagcaacag cagcaacatg 900 gaggcatcgg cgacgacgac gacgtccaag tgttggggac ccagttcacc cacggcgggc 960 accacggctc caagtccgcc accccattcg ccggaggcgg cgacgcgagt ggccggcaat 1020 gtttacaacg gctacacccg cgaactttcg ccgctgcact acaccgcgta cctgccccgc 1080 atggagagcg agataaccgt gatccgtgcc gcggccacag ctctggtggc agccaggaca 1140 tccggaaacg gaagtggaga tcagcacctg gccgtctacc aaacgccgcc ctcgtccacc 1200 acgtcatcac cctcttgttc gcccagcggc gccggcgatc ggtacagtcc gctctcgagt 1260 ggacagacct ccagtgaaag gaagtgcttc agcagtacgg gagccacact gtcgctgccg 1320 ccgaagaaga aggacatcta caggccgtat tccttggacg acaagccggc gcacggctat 1380 cgaagaaggg taccggcgga ggaggacctg catgcggcac atgctatcct ggatctgagt 1440 gcgagcacag ccttccaccc gcccacgcaa cctcaccagt 
tgcagcagca gcaacagcag 1500 cagcagcagc agcatcagca tcaccactcc cagcagcaac accttgcccc gcagcaacat 1560 cactacttgc cactgcagca gcaacagcag cagcaggcac accacaccca cttgccaact 1620 ctcgaggcgc atgcgcatct gcgcagcacg tcatcgattg ccgaactggc tgcagcggcc 1680 agtgtggtaa atgagcaacg ccccgccagc aatgcctcga gtgccagcag caatcacatg 1740 cccagcagtc ccagcagcaa ctcgagcagc agcagcagcc aggtgcagaa cgagaacagc 1800 aacaccacca acacgaacca ggatggggat ggatgtcttc aggatggcga gcacaatggc 1860 gcctccggag cgtcggccaa aacggtggcc tacacctacg aggccttctt cgtcagcgat 1920 ggccgttcga agcgcaaaca tgtggcggat cctgcagctg ccgccagcgg agtgcccact 1980 cctgaccagc agaagaccaa gtacacgtgc tccgagtgtg gcaagcagta tgccacctcc 2040 tcgaacctgt cgcgccacaa acaaacccat gtggactcgc agtccgccaa gaagtgccac 2100 acctgtggca aggcctacgt ttccatgccc gccctggcga tgcatttgct cacgcacaag 2160 ctatcccaca gttgcggcgt ctgtggcaag ctgttctcta ggccctggct gcttcagggt 2220 cacctgagat cgcacacggg cgagaagccc tatggatgtg cgcactgcgg caaggccttc 2280 gccgatcgtt ccaatcttcg agcacacatg cagacccatt cggtggacaa gaacttcgag 2340 tgcaagcggt gccacaagac cttcgccctg aagtcatacc tcaacaagca tttggaatcg 2400 gcttgcctga aggacgagga ggagctgatg atgtccatgt cgctgtcgat gcacgacagc 2460 aatagcgaga gtggggccag tatggcctcg tcgccgccac acgagttcct tgaagcgcgt 2520 caaactcgaa gggcagtgga ggagccactt atgccatgta agcatagaat acattgggcg 2580 cgctacagaa atctctaagg acttgaagcc gcaatccggc tcgggtgacc agacttagac 2640 atctctgttg cacaaccgat aaacagaagc ggaaggatgt gctgtaactg catagctggt 2700 tagttaattt accgttaaag ttaaagttac cgttaccatt tacaactagg ttaaccgata 2760 gccgttttta tagaagatac atacatcaca atacgtccaa gagcaaagtg atgatgagta 2820 acacagagac tgatgaaatt cgcattaggc agccaagcaa caaaatactc aatgggtgtg 2880 ttttcaaaat ctgttatttg gtttgcgaac tcattttttt aaaaggtgcc tctaaacgta 2940 tgcaaaggaa atatccaaat tcatggtcag attttggaaa caaccccata ataaccaact 3000 caaaatggtt cttcttttaa gcataagtga gttgaacatt cctaaaaacc aaagcattcc 3060 acaaaacaaa aaaaaaaaaa aaaa 3084 58 9567 DNA Drosophila melanogaster 58 gttgctgtcg ataatttact ctgtgctcca catatatata tatatatata 
tatatatata 60 ttataaataa aaaccaacca atagcccaaa aaaaaaaaag aaatgaaaaa tccgcatcaa 120 caataaaaat ctgcctgcat ttttgccttt tgtgtgagct gcccagcaga acgagagaag 180 cacttttatt gtatataaaa attatataca tcgccaggag gagctgcagc aacccactcc 240 aagccaggtt gccacgtcct gagctgctgt aagttctccg cagcagctgc agcagcatca 300 gcatcgcagc agcatcagca gcagcgcaca atccgccgca gcatcaattt ggcttttggg 360 cagagataat ttaagacaaa tatatgtgat gctatgcaca tcagcagcta tgaaatatcc 420 ctagaacgcg ttgctgaaga atgtatgggt cgcaggcaat ggaaacatta tcaagacaaa 480 ctgacgtgca gccacttgaa tatcgaggag caacagccca tagcaatagc cggttccgag 540 gacgagccat cgcaatacaa ccacagcagc aaggagatca gccagagcaa tcccaaccac 600 tgtaagacag agaaccaccg tctggagcag caacacaacg gcagccagct attggaagaa 660 gaagattctg agaacaacca aacatcacac gattcatcac gtacaccaac accgggagcc 720 accagtacac catcaccacc gccagaaccc atcgattgga gaccgtcggc caagtgcaac 780 ttctgtgtta acggtcgcct gctaacggtt aacgcccagg gcaagttggt ggccgagtca 840 gcagcaactg ccactagtag tagcactagt aatagtcaca ttcatcagca cgacagtgac 900 agcaactcga gtgcatcact gccccaccac atcagcagca gcagcagcag caacaacaat 960 agcagtggca acagggcacg ccacattgct gctgcaagtg caagagcaac accagcagcg 1020 gccacacccg ccaactccct tgaactctac aagctgctga cccagcgggc agccaaaatg 1080 acatcgatgg actcgatggc cgcccagctg gcgcaattct cactgctggc cgacttcaat 1140 ctgatcaact cgctggccag ccaacagcag cagcagcagc agcaacagat cgctagtgcg 1200 gtaacgccaa ctacctcaga agtatctgca gccgcaatca gtcccgcact caaagataca 1260 cccagtccca gtgtggatgc accgctcgat cttagcagca aaccatcgcc gaactcatcg 1320 attagcggcg atgtgaagtc cgtcagagcc tgtgccacgc ccacgccgtc gggaagaagg 1380 gcgtacagtg aagaggatct gagccgggcc ctacaggatg tggtggccaa caagctagat 1440 gcccggaaat cggctagcca gcaccatgag cagcgctcca ttctggacaa ccggctgttc 1500 aagatgaaac accatgacca ggagcaggat catgatggcg acgagctcga ggactccaac 1560 gatgatgctg aggcggaagt ggacagcaat gcgtcgacac cggtgtatcc ggcagagttt 1620 gcaagggcac aactgcgcaa actgagccac ctgtccgagc acaatggcag cgatctgggc 1680 gaggatgtgg atcgtggatc gccgaaaatg gggcgacatc cggcctgtgg caatgccagt 1740 gccaatcagg 
gcgcaccggc atccattccg ctggatgcca atgtcctgct gcacactctg 1800 atgctggctg ctgggattgg tgcaatgccg aagctggatg aaacgcaaac ggtgggcgac 1860 tttatcaagg gtctgctggt ggccaacagt ggtggcataa tgaacgaggg actgctaaat 1920 ctgctgtccg ccagtcagga gaacagcaat ggcaatgcct cgctgctgct gcaacagcaa 1980 cagcatcagc aacaccatca gcaacaccat cagcagcagc agcagcagca acatgtcgcc 2040 gcctaccggc atcgcctgcc caagtcggag actccggaaa cgaactcctc gttggatccg 2100 aacgatgcca gcgaggatcc catactgaag attccgtcct tcaaggtcag cggtccggcc 2160 agcagcagca gcctgtcgcc gggcggactg gttggtggtc accaccatcc gctgaacaac 2220 aacaacagcc tcagcatcag caacaacagc aaccacagca gcaacagcca tcggaacggc 2280 agcaatcgca gcccgcattc cgcatcgccc atgctggccg cggccgtggc ccaaggtggc 2340 tactccgccg gcaacagttt gctgacctca tcctcgtcta gcatacagaa gatgatggcc 2400 agcaatatcc agcgccagat caacgaacag agtggccagg agagtctcag gaacggaaat 2460 gttagcgatt gcagcagcaa caatggcggc tcctcctcgc tgggatacaa gaagccgagc 2520 atttcggtgg ccaagatcat tggcggaacg gacacctcac ggttcggagc ctcgcccaat 2580 ctgctgtccc aacagcacca ttcggctcac cacctgaccc accagcaaca gcagcaacag 2640 ctgagcgccc aggaggcatt gggcaaggga acgcgaccaa agaggggcaa gtatcgcaac 2700 tatgaccgcg acagtttgtg gaggcggtca aggcggtgca gagagttgaa atgtcggttc 2760 atcgagcggg tagctactat gcgtaccgca ttccacactg gagtacaagg tcaaggaacg 2820 tcacctgatg cgaccgcgca agcgagagcc caagccgcag cccgatctcg tcggcctgac 2880 cggaccagcc aacaagctgc agctggacaa actgaaggcg ggaccacatg gtggctccaa 2940 gctgagcaat gccctcaaga accaaaacaa tcaggcggct gcggcggcgg cggcggcagc 3000 agcagcagcg gccgctgcca cgcccaacgg cctgaaactg ccccttttcg aggcgggtcc 3060 acaggcgtta tcctttcagc cgaacatgtt ctggccccag acgaacgcca cgaatgccta 3120 cggcctggac ttcaatcgca tcacggaggc gatgcggaat tcccaggcct ccaatcacca 3180 cggcttatga agagtgccca ggacatggtt ggagaacgtt tacgatggca tcatcaggaa 3240 gacgctgcag gtgagcaggg caatggcagt gcggcgggta atggcagcaa cggtagcaat 3300 ggcaacgggc atgggcacgg gcatggccat ggacacgccc tgctcgatca gctgctggtg 3360 aagaagaccc ccttgccgtt caccaaccat cggaacaatg actacgtcgt cacctgttcg 3420 agtgccagcg gggagagcgt 
aaagcggtcg ggcagtccca tgggcaacta tgcagacatc 3480 aagcgggagg ccctgagcgc cgacagcggc ggcagcagcg atgaggagca ctcggccagc 3540 cacatcaaca acaacaacag cgatttggcg cacaacaaga acaagagcgg cggcggcggc 3600 ggcggcggca atggccagac caatgggaac ggcaggagca gccggatgac gtcgcgggat 3660 gattccgaaa cggatgccag cagctttaag agcggcgaaa atggcggcca gcaaaaccac 3720 aaaatgatgg atctcaatgg cggcagagca gcagcagtca catcaagtgc gaatcggagg 3780 cggccaccgg acatcacagt cctggacacc acaccacgtc catactgcac gagaagctgg 3840 tccagatcaa ggccgagcaa gtggaccagg cggttcagtt attggagcag ccgatggacg 3900 cgaatccagc gttcgccttg gcaccgttgg tcgcccacta ctacagtttc ttggcggagg 3960 gagggggaac accaaattaa gccacgtttt ttagtagtac catacaaatc actaaataga 4020 attatatata tatatatata tatatatatt cttttataat attttatgcc agccagctga 4080 ccgatgtgcg tggtaaatgt gcgctagtct tagttaaatg tgtaatcaac tgcatagggg 4140 aaaaacaaaa ccacaggaaa tcataaataa caacaaacaa acaaacaaac aaaaataaca 4200 aaaataacaa gaaccgcaag caaagaaaca tacatttgtg ccccggagtg tacgatgtat 4260 atttttgttt cgttttgaca atcgacaaat aggcattctc ttgtacaaac tttcttaaaa 4320 gctaacaaca aaacaaatct aaaaccttaa gaccaaaaaa aacaaaaaat gaaaaaaaac 4380 gaatactgag caaaaaccaa gaaccatttt cattttgcat ttcgtttcga accgcatttt 4440 tgtgttgagc atatttttta ctgaacagta aatgaaacag tccaatggga aaatatatgt 4500 atagcagaaa tatatagcac ttacaagcca acaacttaat cgacttctgt tttggtcagg 4560 tttctggacc ttgagctgcg attttcgcac attccataag atactcttat gttccatata 4620 attgtagttt tcatacgcaa atttctagag cagttagagc cgcagctcag acagggccaa 4680 aaccaaaaaa aatgaccagg cagttgtcct cgacatagac acaatgagta taggccaaca 4740 acagcaacta caacagcaac aataactaca gcaaagagac cataacaaca acaacaacaa 4800 caacaacaac agtaacaacc ataacaagca acaacaacag caatatccga tcaataacaa 4860 caaccaacaa aacaagcaat aataatacaa gactctacaa tacaaagaaa tgaaacattg 4920 aaatagcaaa attcaaaatt caaaaatata aaccgaaaaa ccacaatcaa aaaaccaaaa 4980 caaaattatc cacaaaaatt caaccatttt ttatgatttc caaaaggagg aaaatacaaa 5040 acggaaatcc aattaaccaa agctgccttc acatttacca attaaataaa ttagtaagca 5100 aagcgagaca aagcacacaa aataataatt 
caaatgaaac gcaaacgcag agtaaaaagc 5160 aagaaaatca aacaatttcc gaaatatcag tcccaaatta catttttatt ttgaaaaatt 5220 ccaaaaccta agaatacaaa atattacacc ccaaaacatt caaaattatt ttcattcgga 5280 aaaaaatttc acacatattc aaataaagta atcaaaatta aagtgtttcg gtgttaaaaa 5340 aaaataaaag ggaaaattac cgaaatatat atatctgtgt ttagacttta aaacggaaat 5400 ttgaaaacaa aatttcaaga tttcggctta aaagtaaaaa gagagacaaa aaaaaaacaa 5460 aacaaatgtt tgagaacaca catttcatgt acagtcgcct aaccaccaaa agtaagaaag 5520 cataaatata taaagacttt atattactat ataccatatg atatatattt gtatatttat 5580 gtatgtgtgt gtcttcatca ctacgcgtat accctcaaac caaacacatg atcattttga 5640 gcaactaaat atatttaaat gtacttatac actctacaca ctcttttaca ggagagcaaa 5700 acatatttac acagttaaac cccccccaat ccaaatcttt ggccctcctt tcgacgtatc 5760 tacatttcgt ttgactttga aattctatct atgggtaaac aactactaac taaatgtctg 5820 cgtaaatgaa aatagatggc caattatata aatgtcccta aaaacacata ttttgtgtgc 5880 tagctagtaa gtgtcaaagg aaaaacaaaa aaaccataca aaaacgatat acaatatatt 5940 taaatgatta tgagatggtg aaaattgtcg gaaatatttg aaaatattta gcgaattata 6000 atacaagaaa cagtcaaagg tatggcaagg aaaattgtgg aaaatagcaa gcgaaatgcg 6060 tttaataatt atattgaaat catttaaagg catttaatat atttatcatt tccacgatgc 6120 gattgataaa acagtattta ttggctaatc tccccaatta catttgatgt gcataaatgt 6180 tgtggtttag aatcaaaatc aaaggtacaa aattaaaagt taaggcttaa aaatgtaaaa 6240 aaaaaattta atacaattat gaattttggt ataatagcgg agagttctgc gaacctaaag 6300 aaattcaaaa tgtttattat atgaaaaatg gaaaaatgga aggaaaaata ggcgagagta 6360 gataaagaat ggatggaaat aaatcaaaaa gtatttattg ctaatttaat tatatttgaa 6420 gtatacatac atatttatta tatacataca tatatattag acaccctgtc tgtgattaat 6480 aatccaaaat tttgaagagc attttctgaa ataacgttgg ctaagcatat gcgaaaagac 6540 aaaaccaatg gataaagtaa cacacaccca tgtaaagaaa ttgtagacag atcggataaa 6600 acgaaactaa accaagcaca agctaatggc ccaaatgcag ttggccccga aaatcgagcg 6660 ctgcatttgg ccaagagaat ttcttaagct acggcacaca tcactgaaaa caaaaactga 6720 aaactgaata ctgaatactg cgaataggaa acagtaagca gaagacaaga tcgatggtac 6780 tgttcagaac atatatagtt gtatatattt tggaataatg 
tttaccagtt caagtcaaaa 6840 ttaaaaggaa aaaaaatgca aagtctttta taatgcaaaa ttatacaaag aaaaattaca 6900 atttcgcaac gctaaaaaat gaaaaacgaa aatattgatg taaaaagaat gaaaatcaaa 6960 cttaaaataa taaacaagat aaagtgcaat tatacggttt aaataagcaa atttaagaaa 7020 caaacgttat aaacaaaaac acaaacatgt taaaacaatg aaaatatttc gaagcaagtt 7080 tagctacaaa ttccaaggca actgataatg acaagaacca tttacaagaa aaaccaagac 7140 agcaaagtac agtgcgtttc atgactcgca aatacgggca atagaatact agtttttcat 7200 tgcccatgga gaactgaaac gcactttggc cctcacttca tattgatagg gtaatcggat 7260 ccaaaatctg taaaccaaat tttgggatca gcgattaaaa ccttaacgga agttcataac 7320 tgcagaaaaa aaaagtcgaa agtcgaaatg tcaacctagt ggtagttcga aacacaaaaa 7380 caaaacaaac caatcgagtg taattgagtg acagcttgag aatgttgaat tgtatagaat 7440 ttttgcttgt gcacctggtc gagggggccg tggttgcgcc ccctttttgg tttctaggtg 7500 aacaggcgaa aacgctcagg tacgtgtttt atttttcgga gagaaacaag attgattacc 7560 catacattac ttattctttg ttttactaca acataatagt taatatttgt ataaaaaaaa 7620 aagacgacga tggcgagaag ggaaaaccag ctaaaaaaat tgatatattc ataatataga 7680 attgttttaa atggtttgag agcgaaaaat attgagggtt tctagcgtgc ttcatgaaat 7740 tgctcatatt tgtgtataaa accttatttg ttcaatgcga atcaaaattt gctgaatata 7800 agtgcatata tatataatta gagtttattt ttcggtttag ttaagcgcat acaatgatta 7860 cgatttaaat aattattatt agttatgacc taatagtagg caaatcaaaa tttcttcaca 7920 taaaacaatc aacttctact ttcaaataat ttctagacgt atgtaactat aggttattat 7980 tctattatta tacggcctaa aactatttag cgcgttgtta gactcgattt atggtttgta 8040 catattagac aacattttta tggtattctc ctcttttttt attattacta gcattattac 8100 tccctatttt aattgacttc ttaaatgggc aacatcattt tgaactatag tgcaattgtt 8160 tcaaacaaaa ccaagcaacc aacaacaaat atattaacat taaagattaa tataatttac 8220 aagctttctt tgccgatgcc aagaaggatg aataacgcat atgtctaccg tatggatcga 8280 aaatcaaaaa tcattttaag tgcacataac tctttaaaat agcaattaga ctacgactag 8340 gtttttccat tattatgcgc acgcttcagt ccaaaaaaaa aaaaattgaa attcttcaca 8400 ttctcattca catgctcagt tcgagacatc acaatcacaa gccagaaaaa gaaaactcaa 8460 aacttttcac ctaagttcaa cttgagcgac cgcaacaact caagatcagc 
aaagatcaca 8520 agcgaaaatt gctaagaaac aatagcccag acgtgataac aaccacaaaa tagccacaac 8580 aatagcccaa aaataattga aacaaaaatt gattaaaagc caaaaaaaaa ataaaaaagc 8640 aaataataag aacttaatta catgaaacta attaaataca aagagatgtc caaaagctca 8700 gataaaatcc aagcagccta aagtcgattg tacttttctt ttttctatcc acaaatacaa 8760 ccaacaacaa ccaacaacaa caacaacagc agcaaccaca atttaattaa acaatagtac 8820 tactctaact acagaactta aatagccaca agtaaataga attagccaca gcaatttaac 8880 tttataacaa gatgcccaaa cacaggacac tcacagaaac gtttcttcaa aacagatttg 8940 tactcttagc cacataaccg atacgataca atagccactt aattaggatc gatcatagcc 9000 aagcaataga atcagatatc agatacaaaa ctcagaaccg gaaacaggaa atcgtcaatc 9060 gcgaatcgga aaaaagaagc agcaccaaat cgaactgcaa aggcaaaacc ccagtatatt 9120 aataatgggg agaaacataa tgataaacca tttttatttc aattattaac ttataacaca 9180 acatcaccac cacagcttcc cacagtttca gttacacagg accaccacca tagccctagg 9240 aaattatatt cataattaat caatcaatca atccgtaaaa ccaacaactt caatagttat 9300 aagcaattgg cgtcaaaaaa aaaaaaagcg aaaacaacca aaatcttagc caggcaaaag 9360 ttttgagcat atttttcatt atttttacaa caaacaaaca aactgatgta agtacaattc 9420 ataaaattag actttcggta aactataata aagaaaatag agcagaaaat aacatttttt 9480 ttttgcatta cataattgct acaaaattca aaaacaaaga acgatttttt agtggaaaac 9540 aaaagccaat aagcaaataa aaagaat 9567 59 435 DNA Drosophila melanogaster 59 ataaagcgtg gtctgtggca cgtaattaac gaagaagaga gttaagaaga ttcaacgtcg 60 cgtcgaaggt gaacacgcag ttaaaaccca aaaacaaacg caacgcagcc aattgtgata 120 cgaaaaataa gtaaatcggc agacaatcaa caaaaacaat aaaagcaaag cctacacaca 180 ggcgtaggcg tctagccaac agctgttttt cgtgcgcgct ccgtattcag catactcgct 240 gcgcttacaa acgtcggcag agaccgcggc agagaagcaa gcggcggcag caaaactaca 300 aagcagaaac tacaattcag atcgcgtttt ttcttcggag tggatgagac gttcgttcgg 360 cagttctatt gcgcatagaa ctaacaaggt acctactccg cgaactcacc cccgcgagaa 420 aagaacagaa cagag 435 60 12026 DNA Drosophila melanogaster 60 attctggttg atcctgccag tagttatatg cttgtctcaa agattaagcc atgcatgtct 60 aagtacacac gaattaaaag tgaaaccgca aaaggctcat tatatcagtt atggttcctt 120 agatcgttaa 
cagttacttg gataactgtg gtaattctag agctaataca tgcaattaaa 180 acatgaacct tatgggacgt gtgcttttat taggctaaaa ccaagcgatc gcaagatcgt 240 tatattggtt gaactctaga taacatgcag atcgtatggt cttgtaccga cgacagatct 300 ttcaaatgtc tgccctatca acttttgatg gtagtatcta ggactaccat ggttgcaacg 360 ggtaacgggg aatcagggtt cgattccgga gagggagcct gagaaacggc taccacatct 420 aaggaaggca gcaggcgcgt aaattaccca ctcccagctc ggggaggtag tgacgaaaaa 480 taacaataca ggactcatat ccgaggccct gtaattggaa tgagtacact ttaaatcctt 540 taacaaggac caattggagg gcaagtctgg tgccagcagc cgcggtaatt ccagctccaa 600 tagcgtatat taaagttgtt gcggttaaaa cgttcgtagt tgaacttgtg cttcatacgg 660 gtagtacaac ttacaattgt ggttagtact atacctttat gtatgtaagc gtattaccgg 720 tggagttctt atatgtgatt aaatacttgt attttttcat atgttcctcc tatttaaaaa 780 cctgcattag tgctcttaaa cgagtgttat tgtgggccgg tactattact ttgaacaaat 840 tagagtgctt aaagcaggct tcaaatgcct gaatattctg tgcatgggat aatgaaataa 900 gacctctgtt ctgctttcat tggttttcag atcaagaggt aatgattaat agaagcagtt 960 tgggggcatt agtattacga cgcgagaggt gaaattcttg gaccgtcgta agactaactt 1020 aagcgaaagc atttgccaaa gatgttttca ttaatcaaga acgaaagtta gaggttcgaa 1080 ggcgatcaga taccgcccta gttctaacca taaacgatgc cagctagcaa ttgggtgtag 1140 ctacttttat ggctctctca gtcgcttccg ggaaaccaaa gctttttggg ctccggggga 1200 agtatggttg caaagctgaa acttaaagga attgacggaa gggcaccacc aggagtggag 1260 cctgcggctt aatttgactc aacacgggaa aacttaccag gtcgaacata agtgtgtaag 1320 acagattgat agctctttct cgaatctatg ggtggtggtg catggccgtt cttagttcgt 1380 ggagtgattt gtctggttaa ttccgataac gaacgagact caaatatatt aaatagatat 1440 cttcaggatt atggtgctga agcttatgta gccttcattc atgttggcag taaaatgctt 1500 attgtgtttg aatgtgttta tgtaagtgga gccgtacctg ttggtttgtc ccattataag 1560 gacactagct tcttaaatgg acaaattgcg tctagcaata atgagattga gcaataacag 1620 gtctgtgatg cccttagatg tcctgggctg cacgcgcgct acaatgaaag tatcaacgtg 1680 tatttcctag accgagaggt ccgggtaaac cgctgaacca ctttcatgct tgggattgtg 1740 aactgaaact gttcacgatg aacttggaat tcccagtaag tgtgagtcat taactcgcat 1800 tgattacgtc cctgcccttt gtacacaccg 
cccgtcgcta ctaccgattg aattatttag 1860 tgaggtctcc ggacgtgatc actgtgacgc cttgcgtgtt acggttgttt cgcaaaagtt 1920 gaccgaactt gattatttag aggaagtaaa agtcgtaaca aggtttccgt aggtgaacct 1980 gcggaaggat cattattgta taatatcctt accgttaata aatatttgta attatacaaa 2040 taaaaacaat ttaccaaaat aaaaatataa caaaatgatt ccatggaatc aaaagttaaa 2100 atcaaaataa aacgaagatg ggttttattt atatagttag tgtggggctt ggcaacctca 2160 taaaaagatt ttaacatttc taatgtatgt tgtgcgtatt tgtggcgagt acttacaaca 2220 acggcgtttc ctataaaaat aatgtttcga acatgaaaat cgaagaaaca aaattcgaaa 2280 gtggaagtcg aatcaaaata aaataatttc gaatgtgtgg taatcatcga aataagtgtt 2340 aatataattg gtagatatta actaattttt aaaatttgtg tgtatttatt actatacacg 2400 cgttgcgaat atgtattgtt catcttagtt atgggcatac gttggctaat gcaacaacct 2460 gaaataaaca atgttgtacc tggcatccat caggttaatg ttttatataa attgcagtat 2520 gtgtcaccca aaatagcaaa ccccataacc aaccagatta ttatgataca taatgcttat 2580 atgaaactaa gacatttcgc aacatttatt ttaggtatat aaatacattt attgaaggaa 2640 ttgatatatg ccagtaaaat ggtgtatttt taatttcttt caataaaaac ataattgaca 2700 ttatataaaa atgaattata aaactctaag cggtggatca ctcggctcat gggtcgatga 2760 agaacgcagc aaactgtgcg tcatcgtgtg aactgcagga cacatgaaca tcgacatttt 2820 gaacgcatat cgcagtccat gctgttatgt actttaatta attttatagt gctgcttgga 2880 ctacatatgg ttgagggttg taagactatg ctaattaagt tgcttataaa tttttataag 2940 catatggtat attattggat aaatataata atttttattc ataatattaa aaaataaatg 3000 aaaaacatta tctcacattt gaatgtgaaa aacgaagaga aatattttct ttttcaatca 3060 aataatactg agaaatgtct agcataaaaa attgaaatat ttttcatcta gaattgtctc 3120 ttattaatga ttcggaaata gaaaaatctt ggttatgtta ttattcttcg ttggttcgtt 3180 aaaaatggat aaataaaaac tttgcataca agaattaata aaaatgttat aacgaattta 3240 attaaatgtt ttatcattat atataaagaa tttatggcaa gataaagtta tatacaacct 3300 caactcatat gggactaccc cctgaattta agcatattaa ttaggggagg aaaagaaact 3360 aacaaggatt ttcttagtag cggcgagcga aaagaaaaca gttcagcact aagtcacttt 3420 gtctatatgg caaatgtgag atgcagtgta tggagcgtca atattctagt atgagaaatt 3480 aacgatttaa gtccttctta aatgaggccc gtataacgtt 
aatgattact agatgatgtt 3540 tccaaagagt cgtgttgctt gatagtgcag cactaagtgg gtggtaaact ccatctaaaa 3600 ctaaatataa ccatgagacc gatagtaaac aagtaccgtg agggaaagtt gaaaagaact 3660 ctgaatagag agttaaacag tacgtgaaac tgcttagagg ttaagcccga tgaacctgaa 3720 tatccgttat ggaaaattca tcattaaaat tgtaatattt aaataatatt atgagaatag 3780 tgtgcatttt ttccatataa ggacattgta atctattagc atataccaaa tttatcataa 3840 aatataactt atagtttatt ccaattaaat tgcttgcatt ttaacacaga ataaatgtta 3900 ttaatttgat aaagtgctga tagatttata tgattacagt gcgttaattt ttcggaatta 3960 tataatggca taattatcat tgatttttgt gtttattata tgcacttgta tgattaacaa 4020 tgcgaaagat tcaggatacc ttcgggaccc gtcttgaaac acggaccaag gagtctaaca 4080 tatgtgcaag ttattgggat ataaacctaa tagcgtaatt aacttgacta ataatgggat 4140 tagtttttta gctatttata gctaattaac acaatcccgg ggcgttctat atagttatgt 4200 ataatgtata tttatattat ttatgcctct aactggaacg taccttgagc atatatgctg 4260 tgacccgaaa gatggtgaac tatacttgat caggttgaag tcaggggaaa ccctgatgga 4320 agaccgaaac agttctgacg tgcaaatcga ttgtcagaat tgagtatagg ggcgaaagac 4380 caatcgaacc atctagtagc tggttccttc cgaagtttcc ctcaggatag ctggtgcatt 4440 ttaatattat ataaaataat cttatctggt aaagcgaatg attagaggcc ttagggtcga 4500 aacgatctta acctattctc aaactttaaa tgggtaagaa ccttaacttt cttgatatga 4560 agatcaaggt tatgatataa atgtcccagt gggccacttt tggtaagcag aactggcgct 4620 gtgggatgaa ccaaacgtaa tgttacgtgc ccaaattaac aactcatgca gataccatga 4680 aaggcgttgg ttgcttaaaa cagcaggacg gtgatcatgg aagtcgaaat ccgctaagga 4740 gtgtgtaaca actcacctgc cgaagcaact agcccttaaa atggatggcg cttaagttgt 4800 atacctatac attaccgcta aagtagatga tttatattac ttgtgatata aattttgaaa 4860 ctttagtgag taggaaggta caatggtatg cgtagaagtg tttggcgtaa gcctgcatgg 4920 agctgccatt ggtacagatc ttggtggata gtagcaaata atcgaatgag agccttggag 4980 gactgaagtg gagaagggtt tcgtgtgaac agtggttgat cacgagttag tcggtcctaa 5040 gttcaaggcg aaagcgaaaa ttttcaagta aaacaaaaat gcctaactat ataaacaaag 5100 cgaattataa tacacttgaa taattttgaa cgaaagggaa tacggttcca attccgtaac 5160 ctgttgagta tccgtttgtt attaaatatg ggcctcgtgc tcatcctggc 
aacaggaacg 5220 accataaaga agccgtcgag agatatcgga agagttttct tttctgtttt atagccgtac 5280 taccatggaa gtctttcgca gagagatatg gtagatgggc tagaagagca tgacatatac 5340 tgttgtgtcg atattttctc ctcggacctt gaaaatttat ggtggggaca cgcaaacttc 5400 tcaacaggcc gtaccaatat ccgcagctgg tctccaaggt gaagagtctc tagtcgatag 5460 aataatgtag gtaagggaag tcggcaaatt agatccgtaa cttcgggata aggattggct 5520 ctgaagattg agatagtcgg ccttgattgg gaaacaataa catggtttat gtgctcgttc 5580 tgggtaaata gagtttctag catttatgtt agttacttgt tccccggata gtttagttac 5640 gtagccaatt gtggaacttt cttgctaaaa tttttaagaa tactatttgg gttaaaccaa 5700 ttagttctta ttaattataa cgattatcaa ttaacaatca attcagaact ggcacggact 5760 tggggaatcc gactgtctaa ttaaaacaaa gcattgtgat ggccctagcg ggtgttgaca 5820 caatgtgatt tctgcccagt gctctgaatg tcaaagtgaa gaaattcaag taagcgcggg 5880 tcaacggcgg gagtaactat gactctctta aggtagccaa atgcctcgtc atctaattag 5940 tgacgcgcat gaatggatta acgagattcc tactgtccct atctactatc tagcgaaacc 6000 acagccaagg gaacgggctt ggaataatta gcggggaaag aagacccttt tgagcttgac 6060 tctaatctgg cagtgtaagg agacataaga ggtgtagaat aagtgggaga tattagacct 6120 cggtttggta tcgtcaatga aataccacta ctcttattgt ttccttactt acttgattaa 6180 atggaacgtg tatcatttcc tagccattat acggatatat ttattatatc ttatggtatt 6240 gggttttgat gcaagcttct tgatcaaagt atcacgagtt tgttatataa tcgcaaacaa 6300 attctttaat aaaacgatgc atttatgtat ttttgatttg aaaatttggt ataactccaa 6360 ttactcaggt atgatccaat tcaaggacat tgccaggtag ggagtttgac tggggcggta 6420 catctctcaa ataataacgg aggtgtccca aggccagctc agtgcggaca gaaaccacac 6480 atagagcaaa agggcaaatg ctgacttgat ctcggtgttc agtacacaca gggacagcaa 6540 aagctcggcc tatcgatcct tttggtttaa agagttttta acaagaggtg tcagaaaagt 6600 taccataggg ataactggct tgtggcggcc aagcgttcat agcgacgtcg ctttttgatc 6660 cttcgatgtc ggctcttcct atcattgtga agcaaaattc accaagcgtt ggattgttca 6720 cccatgcaag ggaacgtgag ctgggtttag accgtcgtga gacaggttag ttttacccta 6780 ctaatgacaa aacgttgttg cgacagcatt cctgcgtagt acgagaggaa cccgcaggta 6840 cggaccaatg gcacaatact tgttcgagcg aacagtggta tgacgctacg tccgttggat 
6900 tatgcctgaa cgcctctaag gtcgtatccg tgctggactg caatgataaa taaggggcaa 6960 tttgcattgt atggcttcta aaccatttaa agtttataat ttactttata aacgacaatg 7020 gatgtgatgc caatgtaatt tgtaacatag taaattggga ggatcttcga tcacctgatg 7080 ccgcgctagt tacatataaa agcattattt aatacaatga caaagcctag aatcaattgt 7140 aaacgacttt tgtaacaggc aaggtgttgt aagtggttga gcagctgcca tactgcgatc 7200 cactgaagct tatcctttgc ttgatgattc gatataaaat aaatggttgc caaacagctc 7260 gtcatcaatt tagtgacgca ggcatatgat attgtgtccc tatcatataa ttttaatata 7320 aagaatttaa agaattttat caagagtagc caaacacctc gtcatcaatt tagtgacgca 7380 tatgatattg tccctatcat ataattaata taaagaattt aaagaatttt atcaagagta 7440 gccaaacacc tcgtcatcaa tttagtgacg catatgatat tgtccctatc atataattaa 7500 tataaagaat ttaaagaatt ttatcaagag tagccaaaca cctcgtcatc aatttagtga 7560 cgcatatgat attgtcccta tcatataatt aatataaaga attttatcaa gagtagccaa 7620 agacctcgtc atcaatttag tgacgcgtat gatattgtcc ctatcatata attaatatat 7680 aaagaattta aagaatttta tcaagagtag ccaaacacct cgtcattaac tactataata 7740 ggtaggcagt ggttgccgac ctctcatatt gttcaaaacg tatgtattca tatgattttg 7800 gcaattatat gagtaaatta aatcatatac atatgaaaaa ggcagtggtt gccgacctct 7860 catattgttc aaaacgtatg tgttcatatg attttggcaa ttatatgagt aaattaaatc 7920 atatacatat gaaaattaat atttattata tgtataagtg aaaaatattg aaatattccc 7980 atattctcta agtattatag agaatataat taatatataa agaattttat caagagtagc 8040 caaacacctc gtcattaact actataatag gtaggcagtg gttgccgacc tctcatattg 8100 ttcaaaacgt atgtattcat atgattttgg caattatatg agtaaattaa atcatataca 8160 tatgaaaaag gcagtggttg ccgacctctc atattgttca aaaacgtatg tgttcatatg 8220 attttggcaa ttatatgagt aaattaaatc atatacatat gaaaattaat atttattata 8280 tgtataagtg aaaaatattg aaatattccc atattctcta agtattatag agaatataat 8340 taatatataa agaattttat caagagtagc caaacacctc gtcattaact actaaaatag 8400 gtaggcagtg gttgccgacc tctcatattg ttcaaaacgt atgtattcat atgattttgg 8460 caattatatg agtaaattaa atcatataca tatgaaaaag gcagtggttg ccgacctctc 8520 atattgttca aaaacgtatg tgttcatatg attttggcaa ttatatgagt aaattaaatc 8580 
atatacatat gaaaattaat atttattata tgtataagtg aaaaatattg aaatattccc 8640 atattctcta agtattatag agaatataat taatatataa agaattttat caagagtagc 8700 caaacacctc gtcattaact actataatag gtaggcagtg gttgccgacc tctcatattg 8760 ttcaaaacgt atgtatcata tgattttggc aattatatga gtaaattaaa tcatatacat 8820 atgaaaaagg cagtggttgc cgacctctca tattgttcaa aacgtatgtg ttcatatgat 8880 tttggcaatt atatgagtaa attaaatcat atacatatga aaaaggcagt ggttgccgac 8940 ctctcatatt gttaaaaacg tatgtgttca tatgattttg gcaattatat gagtaaatta 9000 aatcatatac atatgaaaat taatatttat tatgtgtata agtgaaaaat gttgaaatat 9060 tcccatattc tctaagtatt atagagaaaa gccatttagt gaatggatat agtagtgtaa 9120 gctagctgtt ctacgacaga gggttcaaaa actactatag gtaggcagtg gttgccgacc 9180 tctcatattg ttcaaaacgt atgtgttcat atgattttgg caattatatg agtaaattaa 9240 atcatataca tatgaaaatt aatatttatt atgtgtatag tgaaaaatgt tgaaatattc 9300 ccatattctc taagtattat agagaaaagc cattttagtg aatggatata gtactgtaag 9360 ctagctgttc tacgacggag ggttcaaaaa ctactatagg taggcagtgg ttgccgacct 9420 ctcatattgt tcaaaacgta tgtgttcata tgattttggc aattatatga gtaaattaaa 9480 tcatatacat atgaaaatta atatttatta tgtgtataag tgaaaaatgt tgaaatattc 9540 ccatattctc taagtattat agagaaaagc cattttagtg aatggatata gtgctgtaag 9600 ctagctgttc tacgacagag ggttcaaaaa ctactatagg taggcagtgg ttgccgacct 9660 ctcatattgt tcaaaacgta tgtgttcata tgattttggc aattatatga gtaaattaaa 9720 tcatatacat atgaaaataa atatttatta tatgtatatg gaaaaatgtt gaaatattcc 9780 catattctct aagtattata gagaaaagcc attttagtga atggatatag tactgtaagc 9840 tagctgtttt acgacagagg gttcaaaaac tactataggt aggcagtggt tgccgacctc 9900 tcatattgtt caaaacgtat gtgttcatat gattttggca attatatgag taaattaaat 9960 catatacata tgaaaataaa tatttattat atgtatatgg aaaaatgttg aaatattccc 10020 atattctcta agtattatag agaaaagcca ttttagtgaa tggatatagt agtgtaagct 10080 agctgttcta cgacagaggg ttcaaaaact actataggta ggcagtggtt gccgacctct 10140 catattgttc aaaacgtatg tgttcatatg attttggcaa ttatatgagt aaattaaatc 10200 atatacatat gaaaataaat atttattata tgtatatgga aaaatgttga aatattccca 10260 
tattctctaa gtattataga gaaaagccat tttagtgaat ggatatagta gtgtaagcta 10320 gctgttctac gacggagggt tcaaaaacta ctataggtag gcagtggttg ccgacctctc 10380 atattgttca aaacgtatgt gttcatatga ttttggcaat tatatgagta aattaaatca 10440 tatacatatg aaaataaata tttattatat gtatatggaa aaatgttgaa atattcccat 10500 attctctaag tattatagag aaaagccatt ttagtgaatg gatatagtag tgtaagctag 10560 ctgttttacg acagagggtt caaaaactac tataggtagg cagtggttgc cgacctctca 10620 tattgttcaa aacgtatgtg ttcatatgat tttggcaatt atatgagtaa attaaatcat 10680 atacatatga aaatgaatat ttattatatg tatatagggg aaaaaataat catataatat 10740 atatgaataa tggaaaatga agtgttcata tattctcgta atatataaga gaatagcccg 10800 tatgttgggt ggtaaatgga attgaaaata cccgctttga ggacagcggg ttcaaaaact 10860 actataggta ggcagtggtt gccgacctcg cattgttcga aatatatatt tcgtataatg 10920 attatattgg ttacttataa taaagtatat tattatccgt acaaatttgt ttctcagttc 10980 tttttgaaca cgggacttgg ctccgcggat aataggaata tacgctattt tagataatat 11040 cgttgaaaca aaagtcaagt ttctattata catagaataa caaatcgttt ccatatatta 11100 tcgttaattt ttggtggcag gcaaatatta gtttattacc tgcctgtaaa gttggattat 11160 tatatcgtta cggtataata caaaatggat tcatattatt atatgaaaga aatataaaat 11220 ttatatataa atttggaaga attatcatgt gcgctcggtt ttatgttata tattaccaga 11280 gagttatatg aaaagagata aattttaaat ttatcatcaa gatgcaaatg atttaactta 11340 tatttggtta aacaaaaatt gtacaagtgt ggatacaaaa tttatgtatg ttggaaataa 11400 aatgatattt tagaatgaaa tatatgtata tataaagaca aaattataga aaatatatta 11460 caataattgt atgatcttct tgttatattg gtaaaacaag tagaatttaa aaatgggaat 11520 acgaattacg agtgctatat aaaaatggcc gtattcgaat ggatttattt ttataaatat 11580 atttaaaatt tttacccaaa ggcaaaatat tgaattacat tcaataatat aaaaaaatgg 11640 aattatatat aaagtggaaa atctataata tttatattgc ttatttcaat tcaaaaaata 11700 tgaatgaaat atgaaaagaa aacattattc tggttgatcc tgccagtagt tatatgcttg 11760 tctcaaagat taagccatgc atgtctaagt acacacgaat taaaagtgaa accgcaaaag 11820 gctcattata tcagttatgg ttccttagat cgttaacagt tacttggata actgtggtaa 11880 ttctagagct aatacatgca attaaaacat gaaccttatg ggacgtgtgc 
ttttattagg 11940 ctaaaaccaa gcgatcgcaa gatcgttata ttggttgaac tctagataac atgcagatcg 12000 tatggtcttg taccgacgac agatct 12026 61 5966 DNA Drosophila melanogaster 61 gactttatcc ttgttcttgt ttttcgccac cctgccctcg gtgatctgct tctccagatt 60 cacattttta atcgttgcca cgttggcctt cttcggctta cccattttgc tcggatattg 120 tcaaaaatca agtgtaataa taagggcaac aaaaattgta atcactacac gtgctccaac 180 aaaccagtgc gttaccggat attttgaaag aatcgaaaaa caaaacacat tttcgcacgt 240 gtggcaacgt cgcagggctg gccttttcga taggccttgt gagaccaggg atggtggcgc 300 aaacgatatt gaatgcagta attcaataaa atctatatat attgtttttg aagtttatta 360 tgaaatcgat tgtaccgata taattcgttt caagatgtat ttaaaaagtc gcataaataa 420 atatttagaa cgcacaatat agcatgtttg atgaatcagt gcaattatct ccattctcgt 480 gtattattaa taacgagtgt tattttcgta cttatttggc ataattctta taattttaat 540 taatacaaac aatttagtga tcttcatagt tttaataaac tcatttactt tttaatattt 600 gttatgtctc agttttaatg ttatagtgaa ggttggatat ataaatccac ttcatatttt 660 ggtacgtttt tacatcgaaa gaaattatta aatatcgccg tgagcacgaa ctatcgatat 720 gcggcggtac agctggtcga tctgtcatct ctacttcctc ctatcggtct ttttcaaccc 780 tgcgctcgca gccaacagac ctccgccgtt tctttttttt tctgcgttgc gtgcaaacag 840 accgacaata tgaaggtgtg ctaaaaattt gtgataatac tgatataaat tatatcaatt 900 tgtgaaatat tgcgaaaatg ctaaagcgca agcattggcg cggtgatcaa acaaattgca 960 agcgcaaaat gtccagtgat tgcgttaaaa taagtggatg atattatctc cgctgcaatt 1020 cgacttaata tacatttata tgtatgttta tgcattgtgt gtgacttttt tttttgcatt 1080 gtgtgttcca gctcaacgtt tcctatcccg cgacgggatg ccaaaagcta ttcgaagtgg 1140 tcgacgagca caagctgcgc gtcttctacg agaagcgtat gggacaggtt gtggaggccg 1200 atatcctcgg tgacgagtgg aagggctacc agctgcgcat cgccggcggc aacgacaagc 1260 agggattccc catgaagcag ggtgtcttga cccacggtaa gtttgcaatg cgatttacaa 1320 ttatggtgtt gagcagtgaa tgattgtatc tggctagccg gcataggttt agtcttttgt 1380 acgccttgaa agagtggttc ctattaagaa acgtatggaa atccttttaa gcatttgtaa 1440 taaagcctag caacctggat cagtcttgtg ataatatatc taaaagaaca taaccttaat 1500 atggaagcgt ctgtgtatac ttctgtgcat attcgtggcg agtttttcga ttcttagtaa 1560 
tacaaagcga atatttccaa caatttagtt gcaacgttgc tagcagctag taatttacta 1620 tatgtgattg gagtacgttc caaagatgag caaacaatta gtaggataaa tttccttaag 1680 tatttgcaaa cacatttcct tacacccgaa tggctaatgg ctaaccagat aataactttc 1740 caatcactgc tcatatccat ggactgctct caagaaacta ctacaaaaag caatcatctt 1800 ttccaacagg ccgtgtgcgt ctgctcctga agaagggaca ctcctgctac cgtccacgcc 1860 gcactggcga gcgtaagcgc aagtctgtgc gtggatgcat cgtggacgcc aacatgtctg 1920 tgctggctct ggtcgtcttg aagaagggtg agaaggacat tcccggtctc accgacacca 1980 ccatcccacg tcgcctggga cccaagcgtg ctagcaagat ccgcaagctc tacaacttga 2040 gcaaggaaga tgatgtgcgt cgcttcgttg tgcgtcgccc tttgcccgcc aaggacaaca 2100 agaaggccac ctccaaggcc cccaaaattc agcgcctgat cacccccgtt gtgctgcagc 2160 gcaagcaccg tcgcattgcg ctgaagaaga agcgccagat cgcttccaag gaggcttccg 2220 ccgactacgc caagctgttg gtgcagcgca agaaggagtc caaggccaag cgcgaggagg 2280 ccaagcgccg ccgttctgcc tccattcgcg agtccaagag ctctgtctcc agcgacaaga 2340 agtaaacacc gcgacacgga acccacattc ttcgtttaag cagaaactga acgttgatca 2400 caaccaaatg atccacgaga ggagaaataa aactttgtaa ccatttatgt taaataaacg 2460 cgataaatgg tatagaatgt ggtgtgttgt gttcaattat attgatcgat tgagcgagtg 2520 taagctttag gatatatata ctaaagtatc gggtgtaaat gttttcatat tgcatgtgat 2580 tttggttatg agctagccag cggcattttt tggtaaggaa accatggaaa acaaacacta 2640 tcttggaatt ccaacataag ccattatatg gccaaatcgg aatcggaaag agatggaaac 2700 aggaaatcgt gagcggatat atgtagctat ctcttgcgat ttgctgcagt ccccgccaaa 2760 acaataatgg tcccatgtgt agatttctat gttccgtatg gaaacccttt taagcatctg 2820 taataagcca agcaacctgg gtatcttctg tcatgaggtt gcttgtcttc ttctgtcatt 2880 agccaagcaa cctggatcag tcttgtgata atatatctag agaacatacc ttaaatatga 2940 aatcgtctgt gtattcttat gtgcatattc gtggtgagtt cttccattct gagcaataca 3000 aagcgaatat ttccagttgc taaaagctaa taattaacta tatgtgattg gagtacgctc 3060 caaagttgag catccaattg gtataataaa tttccttaag tatttgcaaa cacatccatc 3120 catggactcc tctcaagaaa ctacttaaaa aaacaatcat cttttccaac aggccgtttg 3180 cgtctcctga agaagataca ctcgtgcttc catccacgct gcaataaagt gcgcaagtgc 3240 aagactgtgc 
gtaaatacac cgtggaagcc aacgtatccg cgctgacttt ggtcgttctc 3300 aagaagaatc cctccccatg tcgcctggga cccgtgcgtt ccagcaacat cagcaagatc 3360 tactatttgt gcgaggaaga tgatgaggtt atccctgtta agctgcagcg caggcaccag 3420 aagaagcgcc agaatgcaac caaggaggct atcgccgaat acgtcaaact gttggtcaag 3480 cgcaagaagg agtccaaggc caaccgcggc cgttatgtca ccattcgcaa gccgaaaagc 3540 tctgtcttca gcggcaagaa gtaaataccg cagactctga acccacattc ttcgtttaag 3600 cagaaactga atgttgatca caataaccat ttttgttaaa taaacgcgat aaatggtata 3660 gaatgtggtg tgttgtgttc aattatattg atcgattgag cgagtgtaag ctttaggata 3720 tatatactaa agtatcgggt gtaaatgttt tcatattgca tgtgattttg gttatgagct 3780 agccagcggc attttttggt aaggaaacca tggaaaacaa acaaacagta tcttggaatt 3840 ccaacataag ccattatatg ggcaaatcgg aatcggaaag agatggaaac aggaaatcgt 3900 gagcggatat atgtagctat ctattgcgat ttgctgcagt ccccgctaaa gcaataatgg 3960 tcccatgtgt agatttctat gttccgtatg gaaacccttt taagcatctg taataagcta 4020 agcaacctgg gtatcttctg tcatgaggtt gcttgtcttc ttctgtcatt agccaagcaa 4080 cctggatcag tcttgtgata atatatctag agaacatacc ttaaatatga aatcgtctgt 4140 gtattcttat gtgcatattc gtggtgagtt cttccattct gagcaataca aagcgaatat 4200 ttccagttgc taaaagctaa taattaacta tatgtgattg gagtacgctc caaagttgag 4260 catccaattg gtataataaa tttccttaag tatttgcaaa cacatccatc catggactcc 4320 tctcaagaaa ctacttaaaa aaacaatcat cttttccaac aggccgtttg cgtctcctga 4380 agaagataca ctcgtgcttc catccacgct gcaataaagt gcgcaagtgc aagactgtgc 4440 gtaaatacac cgtggaagcc aacgtatccg cgctgacttt ggtcgttctc aagaagaatc 4500 cctccccatg tcgcctggga cccgtgcgtt ccagcaacat cagcaagatc tactatttgt 4560 gcgaggaaga tgatgaggtc atccctgtta agctgcagcg caggcaccag aagaagcgcc 4620 agaatgcaac caaggaggct atcgccgaat acgtcaaact gttggtcaag cgcaagaagg 4680 agtccaaggc caaccgcggc cgttatgtca ccattcgcaa gccgaaaagc tctgtcttca 4740 gcggcaagaa gtaaataccg cagactctga acccacattc ttcgtttaag cagaaactga 4800 atgttgatca caataaccat ttttgttaaa taaacgcgat aaatggtata gaatgtggtg 4860 tgttgtgttc aattatgtat attgagcgat tgagctaatt aagctttagg atatctattc 4920 taatgtatcg ggtgccagcg 
gaaacgctga aaaatatttt tgatttttaa ttaattattt 4980 caatctattt ttcccccagg taaatgcaac agttccgaca aatagttcaa ttaaagtaag 5040 gccatttatt attttctgag aaaggaaggt cgaaaaactg tttaaatggt ttttttttta 5100 caacattacg atttacccaa tcataggtat attaatattg ttaaacaaat aaacgcagaa 5160 caaacactat cttcgaattc caacataagc cattatatgg gcaaatcgga atcggaaaga 5220 gatggaaaca ggaaatcgtg agccgatata tgtagctatc tcttgcgatt tgctgcagtt 5280 cccgccaaaa caataatgga ttcgtttaaa ttctgatcta gacatatatt ctatacgtaa 5340 gttaaataaa tgtaaaacaa gtagaatgtg tgtatataag atattttaac gtttttaggg 5400 cagtcgagca atttccagtt gtaacatgta cgtggctagc aactcatccg cgttacgatt 5460 tctacctgct ggcggtgcca cttgggatgg ggcttcctcc gccagctgcc ggcacgcttg 5520 gagctacatt ggaagtcgga agggtattag tcgagcccgc ttcttcgtat cgcgttagct 5580 ttgcctcctg atcggccagc agttcgagca aatcttcttg atccttttgc atcgccgcca 5640 gggaatcttg cgccgcatct gcactctgtt tttcggctga taactgctga cgtagtgtct 5700 ccagctcctt agttagacga atgttttcgg caaagtacat gttggcctgg tagcgagcgg 5760 catttagctc ctcctcgttg ggcggcgtgg cttccgcacc ttgggcggac tgcacctgac 5820 cggtcgaggc tcccagctgt gccttaagca gcgtgttctg gtccaagagc tgcgccttga 5880 gcgactgctc ctcaccaagt ttctccttca gctcggcatt ctcctgttcc agttctttgg 5940 aggattgctg cagtgcttgg atctgg 5966 62 187 DNA Homo sapiens 62 cttgccatag gtccaaggat cttataaagg aagctatcct tgacaatgac tttatgaaga 60 acttggagct gtcgcagatc caggagattg tggattgtat gtacccggtg gagtatggca 120 aggacagttg catcatcaaa gaaggagacg tggggtcact ggtgtatgtc atggaaggta 180 cggtttg 187 63 1682 DNA Homo sapiens 63 cctgctgagg ccaagctcgg atccggtgcc gagccaagcg gggccgtgcg tcgccggcct 60 tcgctcgcgt gacctccgcc gtcctcccca accctcgtcc tctgcgcctg cggccgcagc 120 cccagcgccc ctcgcctaac ctcccgccgg gccgcgcctc ctcctcctcc tgctccccgc 180 cgcttccgtt tctcgaggga aaggctgctg cctcctgctc tgtcctcatc cccggcttag 240 ctgacggccc agaggtgggt gccaattcca ccagcagctg caactgaaaa gcaaggttca 300 gaaatgtcag atatcctccg ggagctgctc tgtgtctctg agaaggctgc taacattgcc 360 cgggcgtgca gacagcagga agccctcttc cagctgctga tcgaagaaaa gaaagaggga 420 gaaaagaaca 
agaagtttgc agttgacttc aagactctgg ctgatgtact ggtacaggaa 480 gttataaaac agaatatgga gaacaagttt ccaggcttgg aaaaaaatat ttttggagaa 540 gaatccaatg agtttactaa tgactggggg gaaaagatta ccttgaggtt gtgttcaaca 600 gaggaagaaa cagcagagct tcttagcaaa gtcctcaatg gtaacaaggt agcatctgaa 660 gcattagcca gggttgttca tcaggatgtt gcctttactg acccaactct ggattccaca 720 gagatcaatg ttccacagga cattttggga atttgggtgg accccataga ttcaacttat 780 cagtatataa aaggttctgc tgacattaaa tccaaccagg gaatcttccc ctgtggactt 840 cagtgtgtca ccattttaat tggtgtctat gacatacaga caggggttcc cctgatggga 900 gtcatcaatc aaccttttgt gtcacgagat ccaaacaccc tcaggtggaa aggacagtgc 960 tattggggcc tttcttacat ggggaccaac atgcattcac tacagctcac catctctaga 1020 agaaacggca gtgaaacaca cactggaaac accggctctg aggcagcatt ctcccccagt 1080 ttttcagccg taattagtac aagtgaaaag gagactatca aagctgcatt gtcacgtgtg 1140 tgtggagatc gcatatttgg ggcagctggg gctggttata agagcctatg tgttgtccaa 1200 ggcctcgttg acatttacat cttttcagaa gataccacat tcaaatggga ctcttgtgct 1260 gctcatgcca tactgcgggc catgggtggg ggaatagtag acttgaaaga atgcttagaa 1320 agaaatccag aaacagggct tgatttgcca cagttggtgt accacgtgga aaatgagggt 1380 gctgctgggg tggatcggtg ggccaacaag ggaggactca ttgcatacag atccaggaag 1440 cggctggaga cattcctgag cctcctggtc caaaacctgg cacctgcaga gacgcatacc 1500 tagaggaact ctaaccccgg tgtacctgta taaactgaac tgtgaaactg tttcggttat 1560 ctctgtcttt tgaggatggc tttgtcctgt tgctggttaa cattcacctt cctcttttga 1620 ggagtatttt tccattatgt attcataata atgttaattt caataaatga cattcatgca 1680 gc 1682 64 8671 DNA Homo sapiens 64 cgggagagaa agcgcacgcc gagaggaggt gtgggtgttc cgcttccatc ctaacggaac 60 gagctccctc ttcgcggaca tgggattacc cagcggctgc taacccctct cctcgccctg 120 ctcccccaaa ccggcgtggc tccccgggca ccaaggagct gactacagag gagcaggatt 180 tgcacccctc gctgggcttg ctttggcaac agagtgcctg acccaggtca ggattttcaa 240 gaaagacatg tctgacaaaa tgtctagctt cctacatatt ggagacattt gttctctgta 300 cgcggaggga tcgacaaatg gatttattag caccttgggc ctggttgatg atcgttgtgt 360 tgtacagcca gaaaccgggg accttaacaa tccacctaag aaattcagag actgcctctt 420 
taagctatgt cccatgaacc gctactctgc ccaaaagcag ttctggaaag ccgctaagcc 480 tggggccaac agcaccacag acgcagtgct actcaacaaa ctgcaccacg ctgcagactt 540 ggaaaagaag cagaatgaga cagaaaacag gaaattgctg gggaccgtaa tccagtatgg 600 caatgtgatc cagctcctgc atttgaaaag taataaatac ctaacagtga ataagaggct 660 tcctgctctg ttggagaaga atgccatgag agtcacattg gacgaggctg gaaatgaagg 720 gtcctggttt tatattcagc cattctacaa gctgcgatcc attggagaca gcgtggtcat 780 aggtgacaag gtggttctga accccgtcaa tgctggtcag cccctacatg ctagcagcca 840 tcaactggta gataacccag gctgcaatga ggtcaattcc gtcaactgca atacaagctg 900 gaaaatagtc cttttcatga aatggagtga taacaaagac gacatattaa aggggggtga 960 cgtggtgagg ctgtttcatg ctgagcagga gaagtttctc acctgtgacg aacacaggaa 1020 gaagcagcac gtcttcctga gaaccacggg ccggcagtcg gccacatctg ccaccagttc 1080 aaaagccctg tgggaggtgg aggtggtcca gcatgaccca tgtcggggcg gagcagggta 1140 ttggaacagc cttttccgtt tcaagcatct ggccacgggg cattacttgg cagcagaggt 1200 agaccctgac tttgaggaag aatgcctgga gtttcagccc tcagtggacc ctgatcagga 1260 cgcctctcga agtaggttgc ggaatgccca agaaaagatg gtatactccc tggtctctgt 1320 gcctgaaggc aatgacatct cctccatttt cgagctagat cccaccactc tgcgtggagg 1380 tgacagcctt gtcccaagga actcttatgt tcggctcaga cacctatgta ctaatacctg 1440 ggttcacagc acaaatattc ctattgacaa ggaagaagaa aagcccgtga tgctgaaaat 1500 tggcacctct cctgtgaagg aggataagga agcatttggc atagttccgg tttctcctgc 1560 tgaagttcgg gacctggact ttgccaatga tgccagcaag gtgctgggct ccattgctgg 1620 gaagctagag aagggcacca tcacccagaa tgaaaggagg tctgtaacca agctgctaga 1680 agatttggtt tacttcgtca ctggtggaac taattctggt caagatgttc tcgaagttgt 1740 cttctccaag cccaacagag aacggcagaa actgatgaga gaacagaata ttctcaagca 1800 gatcttcaag ttgttacaag ccccattcac agactgcggt gatggcccaa tgcttcggct 1860 ggaagagctc ggggaccagc ggcacgctcc tttcagacac atctgccggc tctgctacag 1920 ggtgctgaga cactcgcagc aagactacag gaagaaccag gagtatatag ccaagcagtt 1980 tggcttcatg cagaagcaga ttggctatga tgtgttggct gaagacacta tcactgccct 2040 gctccacaat aatcggaaac tcctggaaaa acacattacc gcggcagaga ttgacacatt 2100 tgtcagcctg 
gtgcgaaaga acagggagcc cagattctta gattacctct ccgacctctg 2160 tgtctccatg aacaaatcaa ttccagtgac ccaggaactg atatgtaaag ctgtgctgaa 2220 ccccaccaac gctgacatcc tgattgagac caaattggtt ctttctcgtt ttgaatttga 2280 aggtgtctct tccactggag agaatgctct ggaggcagga gaagacgagg aagaggtgtg 2340 gctgttttgg agggacagca acaaagagat tcgcagcaag agtgtgaggg aattggctca 2400 ggatgctaaa gaagggcaga aggaggaccg agacgttctc agctactaca gatatcagct 2460 gaacctcttt gcgaggatgt gtctggaccg ccaatacctg gccatcaacg aaatctcagg 2520 ccagctggat gtcgatctca ttctccgctg catgtctgac gagaacctgc cctatgacct 2580 cagggcgtcc ttctgccgcc tcatgcttca catgcatgtg gaccgagatc cccaggaaca 2640 agtcaccccc gtgaaatatg cccgcctctg gtcggagatt ccctcggaga tcgccattga 2700 cgactatgat agtagtggag cttccaaaga tgaaattaag gagagatttg ctcagaccat 2760 ggagtttgtg gaggagtatt taagagatgt ggtttgtcag aggttccctt tctctgataa 2820 agagaagaat aagcttacgt ttgaggttgt aaatttagct aggaatctca tatactttgg 2880 tttctacaac ttctctgacc ttctacgatt aactaagatc cttctggcca tattggactg 2940 tgtacatgtg acaacaatct tccccattag caagatggcg aaaggagaag agaataaagg 3000 cagtaacgtg atgagatcta ttcatggcgt gggagagctg atgacccagg tggtgctccg 3060 gggaggaggc tttttgccca tgactcccat ggctgctgcc cctgaaggca atgtgaagca 3120 ggcagagcct gagaaggagg acatcatggt catggacacc aagctgaaga tcattgagat 3180 actccagttt attttgaatg tgaggttgga ttataggatc tcctgcctcc tgtgtatatt 3240 taagcgagag ttttggatga aagcaattcc caggacttca gaaacatcct ccggaaacag 3300 cagccaagaa gggccaagta atgtaccagg tgctcttgac tttgaacaca ttgaagaaca 3360 agcagaaggc atctttggag gaagtgagga gaacacccca ctggacttgg atgaccacgg 3420 cggcagaacc tttctccgtg tcctgctcca cttgacgatg catgactacc cacccctggt 3480 gtcaggggcc ctgcagctcc tcttccggca cttcagccag aggcaggagg tgctccaggc 3540 cttcaaacag gttcaactgc tggttaccag ccaagatgtg gacaactaca aacagatcaa 3600 acaagacttg gatcaactga ggtccatcgt ggaaaagtca gagctttggg tgtacaaagg 3660 gcagggcccc gatgagacta tggatggtgc atctggagaa aatgaacata agaaaacgga 3720 ggagggaaat aacaagccac aaaagcatga aagcaccagc agctacaact acagagtggt 3780 caaagagatt ttgattcggc 
ttagcaaact ctgtgttcaa gagagtgcct cagtgagaaa 3840 gagcaggaag cagcaacagc gtctgctccg gaacatgggc gcgcacgccg tggtgctgga 3900 gctgctgcag attccctatg agaaggccga agataccaag atgcaagaga taatgaggtt 3960 ggctcatgaa tttttgcaga atttctgcgc aggcaaccag cagaatcaag ctttgctaca 4020 taaacacata aacctgtttc tcaacccagg gatcctggag gcagtaacca tgcagcacat 4080 cttcatgaac aatttccagc tttgcagtga gatcaacgag agagttgttc agcacttcgt 4140 tcactgcata gagactcacg gtcggaatgt ccagtatata aagttcttac agacaattgt 4200 caaggcagaa gggaaattta ttaaaaaatg ccaagacatg gttatggccg agctggtcaa 4260 ttcgggagag gatgtcctcg tgttctacaa cgacagagcc tctttccaga ctctgatcca 4320 gatgatgcgg tcagaacggg atcggatgga tgagaacagc cctctcatgt accacatcca 4380 cttggtcgag ctcctggctg tgtgcacgga gggtaagaat gtctacacag agatcaagtg 4440 caactccctg ctcccgctgg atgacatcgt tcgcgttgtg acccacgagg actgcatccc 4500 tgaggttaaa attgcataca ttaacttcct gaatcactgc tatgtggata cagaggtgga 4560 aatgaaggag atttatacca gcaatcacat gtggaaattg gttgagaatt tccttgtaga 4620 catctgcagg gcctgtaaca acactagtga caggaaacat gcagactcga ttttggagaa 4680 gtatgtcacc gaaatcgtca tgagtattgt tactactttc ttcagctctc ccttctcaga 4740 ccagagtacg actttgcaga ctcgccagcc tgtctttgtg caactgctgc aaggcgtgtt 4800 cagggtttac cactgcaact ggttaatgcc aagccaaaaa gcctccgtgg agagctgtat 4860 tcgggtgctg tctgatgtag ccaagagccg ggccattgcc attcccgtgg acctggacag 4920 ccaagtcaac aacctctttc tcaagtccca cagcattgtg cagaaaacag ccatgaactg 4980 gcggctctca gcccgcaatg ccgcacgcag ggactctgtt ctggcagctt ccagagacta 5040 ccggaatatc attgagagat tgcaggacat cgtctccgcg ctggaggacc gtctcaggcc 5100 cctggtgcag gcagagttat ctgtgctcgt ggatgttctc cacagacccg agctgctttt 5160 cccagagaac acagacgcca gaaggaaatg tgaaagtggc ggtttcattt gcaagttaat 5220 aaagcataca aaacagctgc tagaagaaaa tgaagagaag ctctgcatta aggtcctaca 5280 gaccctgagg gaaatgatga ccaaagatag aggctatgga gaaaagggtg aggcgctcag 5340 gcaagttctg gtcaaccgtt actatggaaa cgtcagacct tcgggacgaa gagagagcct 5400 taccagcttt ggcaatggcc cactgtcagc aggaggaccc ggcaagcccg ggggaggagg 5460 gggaggttcc ggatccagct ctatgagcag 
gggtgagatg agtctggccg aggttcagtg 5520 tcaccttgac aaggaggggg cttccaatct agttatcgac ctcatcatga acgtatccag 5580 tgaccgagtg ttccatgaaa gcattctcct ggccattgcc cttctggaag gaggcaacac 5640 caccatccag cactcctttt tctgtcgctt gacagaagat aagaagtcag agaaattctt 5700 taaggtgttt tatgaccgga tgaaggtggc ccagcaagaa atcaaagcaa cagtgacagt 5760 gaacaccagt gacttgggaa ataaaaagaa agacgatgag gtagacaggg atgccccatc 5820 acggaaaaaa gctaaagagc ccacaacaca gataacagaa gaggtccggg atcagctcct 5880 ggaggcctcc gctgccacca ggaaagcctt caccactttc aggagggagg ctgatcccga 5940 cgaccactac cagcctggag agggcaccca ggccactgcc gacaaggcca aggacgacct 6000 ggagatgagc gcggtcatca ccatcatgca gcccatcctc cgcttccttc agctcctgtg 6060 tgaaaaccac aaccgagacc tgcagaactt cctccgttgc caaaataaca agaccaacta 6120 caatttggta tgtgagaccc tgcagtttct ggactgtatt tgtggaagca caactggagg 6180 ccttggtctt ctgggcttgt atataaatga aaagaacgta gcgcttatca accaaaccct 6240 ggaaagtctg accgaatact gtcaaggacc ttgccatgag aaccagaact gcatagccac 6300 ccatgaatcc aatggcattg acatcatcac agccctgatc ctcaatgata tcaatccttt 6360 gggaaagaag aggatggacc ttgtgttaga actgaagaac aatgcctcga agttgctcct 6420 ggccatcatg gaaagcaggc acgacagtga aaacgcagag aggatacttt ataacatgag 6480 gcccaaggaa ctggtggaag tgatcaagaa agcctacatg caaggtgaag tggaatttga 6540 ggatggagaa aacggtgagg atggggcggc gtcccccagg aacgtggggc acaacatcta 6600 catattagcc catcagttgg ctcggcataa caaagaactt cagagcatgc tgaaacctgg 6660 tggccaagtg gacggagatg aagccctgga gttttatgcc aagcacacgg cgcagataga 6720 gattgtcaga ttagaccgaa caatggaaca gatagtcttt cccgtgccca gcatatgtga 6780 attcctaacc aaggagtcaa aactacgaat ttactatact acagagagag acgaacaagg 6840 cagcaaaatc aatgatttct ttctgcggtc tgaagacctc ttcaatgaaa tgaattggca 6900 gaagaaactg agagcccagc ccgtgttgta ctggtgtgcc cgcaacatgt ctttctggag 6960 cagcatttcg tttaacctgg ccgtcctgat gaacctgctg gtggcgtttc tctacccgct 7020 taagggagtc cgaggaggaa ccctggagcc ccactggtcg ggactcctgt ggacaggcat 7080 gctcatctct ctgggcatcg tcattggcct ccccaatccc catggcatcc gggccttaat 7140 tggctccact attctacgac tgatattttc agtcgggtca 
caacccgcgt tgtttcttct 7200 gggcgctttc aatgtatgca agaaaatcat ctttctaatg agctttgtgg gcaactgtgg 7260 gacattcaca agaggctacc gagccatggt tctggttctg gatgtcgagt tcctctatca 7320 tttgttgtat ctggtgatct gtgccatggg gctctttgtc catgtattct tctacagtct 7380 gctgctttta gatttagtgt acagagaaga gtctttgctt aatgtcatta aaagtgtcac 7440 tcgcaatgga cggtccatca tcctgacagc agttctggct ctgatcctcg tttacctgtt 7500 ctcaatagtg ggctatcttt tcttcaagga tgactttatc ttggaagtag ataggctgcc 7560 caatgaaaca gctgttccag aaaccggcga gagtttggca agcgagttcc tgttctccga 7620 tgtgtgtagg gtggagagtg gggagaactg ctcctctcct gcacccagag aagagctggt 7680 ccctgcagaa gagacggaac aggataaaga gcacacatgt gagacgctgc tgatgtgcat 7740 tgtcaccgtg ctgagtcacg ggctgcggag cgggggtgga gtaggagatg tactcaggaa 7800 gccgtccaaa gaggaacccc tgtttgctgc tagagttatt tatgacctct tgttcttctt 7860 catggtcatc atcattgttc ttaacctgat ttttggggtt atcattgaca cttttgctga 7920 cctgaggagt gagaagcaga agaaggaaga gatcttgaag accacgtgct ttatctgtgg 7980 cttggaaaga gacaagtttg acaacaagac tgtcaccttt gaagagcaca tcaaggaaga 8040 acacaacatg tggcactatc tgtgcttcat cgtcctggtg aaagtaaagg actccaccga 8100 atatactggg cctgagagtt acgtggcaga aatgatcaag gaaagaaacc ttgactggtt 8160 ccccaggatg agagccatgt cattggtcag cagtgattct gaaggagaac agaatgagct 8220 gagaaacctg caggagaagc tggagtccac catgaaactt gtcacgaacc tttctggcca 8280 gttgtcggaa ttaaaggatc agatgacaga acaaaggaag cagaaacaaa gaatgggtct 8340 tcttggacat cctcctcaca tgaatgtcaa cccacaacaa ccagcataag caaatgaaag 8400 aaaggaattg tatttacctt ttataattat tattagtgtg ggtatggcta atgagttctg 8460 attcacccac gaaggttaca tttatgctga atacatttgt aaatactcag ttttatactg 8520 tatgtatatg attgctactc taaaggtttg gatatatgta ttgtaattag aattgttggc 8580 atgatgacat ttcatttgtg ccaaaaatat taaaaatgcc ttttttggaa ggactaacag 8640 aaagcacctg atttgcactt gaaccagtcc g 8671 65 2706 DNA Homo sapiens 65 ctctgtagct gtgaccctga taccgcgtgg tgtgctccga acacatggtg cccagaacga 60 aggcggcgtc cagaagccct aggtcccaga ggtccgctca gcggcaggcg cataaggcgg 120 ggccggcgcg ggcctttcct tccatcggaa ccgttctccc ggggctgagt 
ccctgcccgg 180 actccgaacg ccgaagacca ggggccggaa gcgcgcgccg ccactgccac gccgtgtcag 240 tcgggaggga gggagcgagc aggcgaagcc gcggaggacg gggtgaagat ggcggccttc 300 tccgagatgg gtgtaatgcc tgagattgca caagctgtgg aagagatgga ttggctcctc 360 ccaactgata tccaggctga atctatccca ttgatcttag gaggaggtga tgtacttatg 420 gctgcagaaa caggaagtgg caaaactggt gcttttagta ttccagttat ccagatagtt 480 tatgaaactc tgaaagacca acaggaaggc aaaaaaggaa aaacaacaat taaaactggt 540 gcttcagtgc tgaacaaatg gcagatgaac ccatatgaca gaggatctgc ttttgcaatt 600 gggtcagatg gtctttgttg tcaaagcaga gaagtaaagg aatggcatgg gtgtagagct 660 actaaaggat taatgaaagg gaaacactac tatgaagtat cctgtcatga ccaagggtta 720 tgcagggtcg ggtggtctac catgcaggcc tctttggacc taggtactga caagtttgga 780 tttggctttg gtggaacagg aaagaaatcc cataacaaac aatttgataa ttatggagag 840 gaattcacta tgcatgatac cattggatgt tacctggata tagataaggg acatgtcaag 900 ttctccaaaa atggaaaaga tcttggtctg gcatttgaaa taccaccaca tatgaaaaac 960 caagccctct ttcctgcctg tgttttgaag aatgctgaac tgaaatttaa cttcggtgaa 1020 gaggaattta agtttccacc aaaagatggc tttgttgctc tttccaaggc accggatggt 1080 tacattgtca aatcacagca ctcaggtaat gcacaggtga cacaaacaaa gtttctcccc 1140 aatgctccga aagctctcat tgttgaacct tcccgggagt tagctgaaca aactttgaac 1200 aacatcaagc agtttaagaa atacattgat aatcctaaat taagggagct tctgataatt 1260 ggaggtgttg cagcccggga tcagctctct gttttggaaa atggagtaga tatagttgta 1320 ggtactccgg gaagactaga tgacttggtg tcaactggaa agctgaactt atctcaagtt 1380 agattcctgg tcctggatga agctgatggg cttctttctc aaggttattc tgattttata 1440 aataggatgc acaatcagat tcctcaggtt acctctgatg gaaaaagact tcaggtgatt 1500 gtttgctctg ccactttgca ttctttcgat gtaaagaaac tgtccgagaa gataatgcat 1560 tttcctacat gggttgactt aaaaggagaa gactctgttc cagatactgt acaccatgtt 1620 gttgtcccag taaatcccaa aactgacaga ctctgggaaa ggcttggaaa gagccacatt 1680 agaactgatg atgtacatgc aaaagataac acaagacctg gtgctaatag tccagagatg 1740 tggtctgaag ctattaaaat cctgaaaggg gagtatgctg tccgggcaat caaggaacat 1800 aagatggatc aagcaattat cttctgtaga accaaaattg actgtgataa cttggagcag 1860 
tactttatac aacaaggagg aggacctgat aaaaaaggac accagttctc atgtgtttgt 1920 cttcatggtg acagaaagcc tcatgagaga aagcaaaact tggaaagatt taagaaagga 1980 gatgtaagat tcttgatttg cacagatgta gctgctagag gaattgatat ccacggtgtt 2040 ccttatgtta taaatgtcac tctgcccgat gaaaagcaaa actacgtaca tcgaattggc 2100 agagtaggaa gagctgaaag gatgggtctg gcaatttccc tggtggcaac agaaaaagaa 2160 aaggtttggt accatgtatg tagcagccgt ggaaaagggt gttataacac aagactcaag 2220 gaagatggag gctgtaccat atggtacaac gagatgcagt tactatctga gatagaagaa 2280 cacctgaact gtaccatttc tcaggttgag ccggatataa aggtaccagt ggatgaattt 2340 gatgggaaag ttacctacgg tcagaaaagg gctgctggtg gtggaagcta taaaggccat 2400 gtggatattt tggcacctac tgttcaagag ttggctgccc ttgaaaagga ggcgcagaca 2460 tctttcctgc atcttggcta ccttcctaac cagctgttca gaaccttctg atttttacat 2520 ttactgaata agatttgagt aatgaaagtc tgtagtctta aaactctaaa acagttgtac 2580 tgcttccaag cagcagtatt tatagtaacg taagctatta atgctaactc ttgcatgtca 2640 agaaacatta gtcttaggaa ttcttcaaaa aatggcatcc caatgaaaat aaatttgatg 2700 actata 2706 66 19974 DNA Homo sapiens 66 cattccttcc tcccccacac tgttttatat tttataaagg gaattcacag cctagtggat 60 acggggttga gtaaaagtac ttcctgcctc tattgaagct tcaggacact ggactggagt 120 cttccacaat tcagccagaa cccagggtac tccaaaccac tttgtgatcc tccatctctg 180 taggctacag tttataaaac acttctatct ccatgatttc atgtgattcc cagagtgact 240 catgaggttt cagacctctc catcctcaga cagaccctgt cacatgcctc tgaaaatgct 300 ttagttgttg ccatgtggat ggacactcag aaaatacttt aaaaaaaaaa aatctaaact 360 gcttgcttta gtggttttgt gtaaaatgca accagagaga gagagagaga gaggtccttg 420 gcagataggt ttaactcaaa tcactgagca ggaaaatatc tgggaatctg gcctggtttg 480 agaaaaagag ccagcttttt tagctgacta gagagcagct taaaaaggaa aaaaatgaaa 540 ggaaggaaac taatgaggca tctctttcct cacagataat gtgggatgct cacccagtgg 600 tttgcagtgg ctacagcatc tgttacactt cagtgacaac caaaatgcct tggagataag 660 acaacaatta tgagacctgc ttgagaagaa atgaaagaaa acccaatcgc agccaaatag 720 cactgccatc tattaacaaa ttattacagg ctgaagcagt ctttaatgag attttaacaa 780 aagaaatcca gtcttgttcc tctggggcaa agagatttta tcatcatgtg 
aatggttaag 840 caggtaagag gacctccctt ggagcaccaa taaagagaaa atgtgtgagc cctgtgttct 900 ccctctccaa ggagctctaa ctcccagtca agatgaaaac atattggaga aaggacccaa 960 ataatactgt tccagagaaa gttttagttt agtaagtggc ctgatgagac aagtgagtat 1020 tcaaaaaagc tgtattagac ctttgtgttg tgattccaat cctttgtgct tttggttatc 1080 ctaaaggaca ttgtgacatt ctcaagagat tcttaagtca aagggtatag tggcaacatt 1140 aatctggagg aacaatgaag ctagctagag tcaggacgtg cctgcccatg atcttgtaga 1200 cccactgggg tttcagaaga gaaactgata tgagtctaat gttgttagtt caaaggagga 1260 aagaaaaatt ggttgcatgg agtgagtcag agggagggag gaaaatatgc agtccaagga 1320 agtcggtgag aagcccgctc agatgcacta gactgtcttg catgaattgg gtttttcttt 1380 ttgcttctcc cacccctttc attgatccac tacgtgaaca cttataggca gccaataata 1440 tcaggagaaa tatagccatt ttgatactaa tatttctgtt acccaggaat ttgctttatt 1500 tgcaatatga gcctccaatg cctgattaaa ggaaactgag gactaacctg ggtgatccaa 1560 gaattcctag actgttaaat agtgtagttt ctttagaaag ggactctgga cctcctcctg 1620 ggaggagcct tagaagggca gtgaggctag ggtgagtaag caggctcccc acgtgagggt 1680 gagtcaggga gactgattcc ttagcagctg gggcactgtg taagactacg tcaccttctg 1740 tggcagaaga acaaccacaa tcactatgtg ttatgctggt tatattcccc ttgagaccaa 1800 tggcagacac tggctgttaa aatatttagt gtctgataat gtatgattaa tgggtaagga 1860 gtagagtagc acttctttta atggctcaca ctgtaggaaa gagatttaat agaattgaag 1920 atgaacttag aatgacatag aagataaatc tcaggaacgt acatctttgg gacagatagc 1980 tcattttgtt agaatatagt gctattattg ttattagcta acatttatta agcactgact 2040 aagtgccagg agctgcttca agtgttttac gtatgttaac ttaatgcttc ttgtgctgtg 2100 ataatgagcc agtatttgtt ttttaaattt ccaatgtgtc atggacaaat gcttttgtgg 2160 aattcaataa aaataaatta ctacaaaagt gaaaaaaaaa ttaaaacatg caaaattcaa 2220 gctcaatttt ttcctttttt aactattagc tgtaagagac atcaaattgt caaaatgcta 2280 taaaagtttc taaatcctta ctcttagttg ctatacgtat ctccccgtag actagtaaca 2340 aaccttcacc aactggcgca agtctacgaa ctctgttgaa gagcactgta ttaactcatt 2400 taaaccctcc taaaagttca tgacgtcatt aatattttca ttctacagat gaggaaactg 2460 aggcacaaaa cagttatctt gctactgagg ccagtcaatt agttttgcaa agagaaaaat 
2520 atctatttca tggtcacaga agagtacccc aactagtcac tgaaaaatta agagtagcat 2580 aagaaaagca agccttaatg agaaagtatg aatggagttt tattatccag gcttctgtgg 2640 aaaaaccagg aaaggcatag gccatactgg tgacattttg ggagtgtcac cgttgctata 2700 ttgggcaata ggatgaagtg tcttaagact gtggcttgag ggctagactt tctgggttag 2760 ttttcttgct tggccaccta cttcctatgc gcagttgtta gtttctatct gaaaggatgg 2820 ttgggaggat taactgtgtt gtatatgtaa agtatttttt aaaaatgcct ggcacatagt 2880 aagtggttta taagtgttat tgtatttgca tatgaaaaac cacccattgc aaaaatttca 2940 gattccttaa atgcctccat ctatagtggt ttgacttctg attaaacaag cctatcctcc 3000 ctgccttaaa ctagagcccc tggtaaaact cagcagtgct gcagtggtca ctgccagacc 3060 tcaggtacag agcaacgttt ggggctggat aattattttt cttgaggagg gggtggttac 3120 tgtgtgcatt gtatgatagc agcattcctg gtctttaccc actagatgcc agtattatcc 3180 ccttccagct gtgacaacca aaaatgtctt cagtcattgc caaatgtccc ctgagaggca 3240 aaattgcccc cagttgagaa ccactgatac agagaaaaga gaatattgat aacagtctct 3300 tatattttca tagcttttca gtgtaaaaag gcttctgcat gttttaactc agttgggaaa 3360 aagaaaaaga ggaaaaagat aaaatagatc agaagaaaaa agagaaggga ataggggagg 3420 gagaaggaag ggagagaatg cagagtgcaa agagatgaag tggagcagac acgcaaaaaa 3480 caaaggaatg aacaaacagg tgtgtgagtt tgtgtgatcg cccatgagcg catctacacc 3540 tgcaaatcct cccagcacct cccattcaac atgttcaaaa ctgaactcag caggttgtag 3600 ataaggaagg aagctgatgt ccatggcggt ctaaggtgtg accttgggca gattcttctc 3660 agtctagttt ccttgattgt aaaatgaaga taataccacg tgccttaatg gactgtgtga 3720 gctggcatgg agctgaccta tcacagtgtg ctcttgccca tccttttccc tcctccttgt 3780 ctcatgccag taagttaagg gctaggttct ggcactcttc ctccttcaca gcactgcctc 3840 ccggacagag gtcctgtccc tcactagcac ccagtgcttc tgtccatttg atgccaccat 3900 tccattcctc ttcccactgc cccaattcat ggccagccct caacaccgca catctaatgt 3960 cctgtgggag cctcctagta gttggacttc ctggctccaa gctctcccca cgccaatcca 4020 gattcaggtt ctaaaacagc actgggatca agtcactcta attttttaaa aaactgttca 4080 ctggctcccc agcataaaat ctaaacttct tcacctatca ttaaacgcca tcctctctcc 4140 ctacttccca ctgtctgtct ccaacctatc ttcctaacct taattctcca ttgtgctacc 4200 
catctctctc tctctttttt aagacggagt ttcgctcttg ttgcccaggc tggagtgcaa 4260 tggtgcaatc tcggctcact gcaacttctg cttcccgggt tcaagtgatt ctcctgcctc 4320 agccccccaa gtagctggga ttacaggcac gcaccaccat gcccagctaa ttttgtattc 4380 ttggtagaga tgcggtttca ccatgttggc caggctggtc ttgaactcct gacctcagat 4440 gatctgcccg actcagcctc ccaaagtgct gacattacag gcatgagcca ccacgcccgg 4500 cctgtacttc ccatctcttt tgaaactaaa aagtcagttt cttatgcaca ccctatgttt 4560 tcctcatctt cctgggattc ccttcctaaa tccccacagt ctatccccat ccctcaaggc 4620 ctatttcact tgttaacttc cttcatgaag cctaacctcc tcttcctgga catgtacagt 4680 tgtcctttgt tatcatcagg ggattggttc caggacccta cagataccaa aatccatgga 4740 tgcccaaatc ctttatataa aataatgtag tatttgcaca taacctatgc acatcctccc 4800 atatagctta tctctagatt acttataata cctaatacaa tttaaatgct atgtaaatag 4860 ctgttatact gtatatttta aatttgtttt atttatttta ttctcaaata attttgatct 4920 ttggttgatt gaattcacat atgcggaacc cagggacatg gaaggactat attcagtctc 4980 tctgatctct ggattctcat agcacttttt ctgcttctac tatgtagcat ttatgtcttt 5040 ccaccttgta ccataaatat ctatgtgttt atgtcttact gtgtcccgaa ttggtaggtt 5100 cttagtctca ctgacttcaa gaaggaagct gcggaccctc ctggtgagta ttacacttct 5160 taaaagcagc atgtctggag tttgttctga tggtcggatg tgctccgagt ttcttccttc 5220 tggtgggttc ctggtgtcgc tggcttcagg agtgaagctg cagaccttcg tggtgagtgt 5280 tacagcttat aaaggcagtg tggacccaaa aagtgagcag caacaagatt tattgcaaag 5340 agcaaaagaa caaagcttct acagctggaa aaagaccaga acagatcgcc gctgctggct 5400 cgggcagcct gcttttattc tcttatctgg caccacccac atcctgctga ctggtccatt 5460 ttacagagag ccgattggtc tgttttaccg agagctgatt ggtccgtttt gacagggtgg 5520 ggattggtgc gtttacaatc cctgagatag acacaaaagt tctccaagtc cccactagat 5580 tagctagaca cagagtgctg attggtgcat ttacaaactt tgagctagat acagagtgcc 5640 gattgattta caatccctta gctaaacata aaggtttctc caagtcctcg ctaaactcag 5700 gagcccagct ggcttcaccc agtggatccc gcacccgggc tgcaggtgga gctgcctgcc 5760 agtcccgcgc cttgtttccg cacccctcag cccttgggcg gtggatggga cgaaagggcg 5820 ccgcggagca gggggcggcg ctcgtcgggg aggctgtaca ggagtccacg gcgggttggg 5880 ggaggctcag 
gcatggcggg ctgcaggtcc cgagccctgc cccgcgggga ggcagctaag 5940 gcccggcgaa ataatcgagc cagcgtcggt gggccggcac tgctggggga cccgacgcac 6000 cctccgcaac tgctggcccg ggtgctaagc ccctcactgt ctcccgggcc ggccggccgt 6060 tccgagtgcg gggcccgcga agcccacgcc cacccagaac tcgcgctggc ccgcaagcgc 6120 cgcgcgcagt cccggttccc gcccgtgcct ctccttccac acctccctgc aagctgaggg 6180 agccggctca ggcctcagcc agctcagaaa ggggctccca cagtgcagca gcgcgctgaa 6240 gggctcctca agcgtggcca gagtgggcgc cgaggccgag gaggcgccga gagcgagcaa 6300 gggctgcgag ggctgacagc acgctgtcac ctctcaatcc tgtctactgt tctgctagag 6360 aggcaaccta aatgccaggc caaggtgagg aacaattatt tgctcgatca acttcccaga 6420 agtttctatc aataattctt ctgggttcag agttctcaga aatattgaaa caaagttggt 6480 ttctaattct gtctaaagag ataacatttt tactttgaag gagacatcat attactcttc 6540 ttacattccc acttatttct atacaggagg taacttcact aactgttcaa atatccactc 6600 tactggttcc tccccaggta gaaaaaaaaa aagattaatt ctttaatttt tatagtgagc 6660 agggctattg gcttcgaaaa caggaaacca ggcctagcag gtcgtctggt ccgggcttcc 6720 acatctatct gtcttctagg tgaagagagc ccattgcatc ttccttgccc atgcaataga 6780 gagaactagg tctgtcttgc ttgtgacagt gtgtgtgaca aaccttgtgg ccgtttgcct 6840 tagtgaatta acggcaaact cagagtctct gcactctaca cattcctcac cttggaggct 6900 cgtgcctgtc ccaaggccca gattccagtg cagaagtgag aagcccccag tttggcacca 6960 gtttcaagca gcttttctct atccgaagtt gtaccacaga ttctcatatt gtagagatta 7020 ttgggcactt gccaaagtgt tgataaagct aatatttttg tctttttttg gctagcggtg 7080 cttatgtgta ttatctaaga gtcagaatcc agagaggaaa agaaggtggt tttacatcca 7140 gaactgctgg ctctgttgct gggttaggag ctttctccat cttagtctgt tctatatcaa 7200 ggctaacctt actttcatgg atcagcgaat ctgcaggccc tttcagtgat ttttaaagcc 7260 atacacacaa atactatcta acaaccctca aagatggcct gccactagaa atttattcat 7320 tttacagtta ccaaagtagg aggatggaga caccttcagt aatatttgtt tagggaccta 7380 gaattcaaaa ctaaaaacta tgtagtctaa ttttgtcaca aatagctcat acatgttagc 7440 aagttaattc tgacaaaagt caatctaggc gggcttactt atatttctac ttatcatctt 7500 aaaaataaag tttgagttcc agttccacta ccagttgttc tgtgatttgg ataaggccta 7560 ttcctgggcc tccatttcct 
cgtctagaac aatgggtgtt ttgatccaga ccatcactga 7620 caaccctcct gctcctatgt ccttttcata ctcagaagtt agcatttgag cctacgttgc 7680 cttcagggtg tattttggtg gctctgaggg gtttggggag aggcgtagat ctagaggaaa 7740 atttaatgaa cataaaggga gaaacttgcc agatgcaaag tgggagttta ttccaggaat 7800 attaaattat catgaaatac atccatctac atttattaag aaaaggaagc aaaactcttc 7860 ttgcctgaag ccttcatttc taaagcatta acaaattatt catttccttc atgagctctt 7920 tagtccttgg ggtatttaac accaggagtt tgaccatatt atgactcatc cttatatcac 7980 atgataaaga tgatactaaa tattgtaaac atatgacagt ggatataagt tatgaatcct 8040 cattaattta ttttgctttt agcttaatgg cttaatactt tgcagtaaac ttctttttac 8100 ccaaaacagg actggtgtgt ctcagatttg cagtattttc atgagtaaat accactgacc 8160 acattaagaa agcattcttc ttttatgcag atgtgtgcca aagattatta aaatgagaaa 8220 gaattttact ttctcagcta ctttaaaaaa ataagatttt aaaacaggcc atatcattat 8280 ttagctgtat actctatgtt agttttccgt gactgcataa cacattatca caaacttaga 8340 gcaacatcca attcttggct cacagttctg taggtcagag tgtgggctct gctggattct 8400 ttacacagga tcaaacaggg ccaaaatcaa ggtgtttgcc tactggctct tccctggagg 8460 atctgggaag aatctgcttt caaggtcatt caggttgttg acagaattct attcctgtgg 8520 ttgtaggact gaagtcccat ttccttgctg gctattggtg gggggctgct ctttggtcct 8580 taaggtcact cacattcctt ggcagatgac cccttccttc ttcgagccag aaatggtgca 8640 tagaatcccc tttgtgcttt gaatctctga ctcctcttct gttgctagcc aaagaaaact 8700 ctctgctctt aaaaggttca tctactgggt caggcccaca cagataatct ctgtaactca 8760 agggtaactg atttgggact ttaattatat ctgtaaaata ccttctcagc attaagacga 8820 ttagtgtttg actaaccagg gacaagcatc ttgaggggac atttttataa ttctatctat 8880 ctgtcataca tgactaacca gggacaagca tcttgagggg acatttttat aattctatct 8940 atctgtcata cacagttgtc ccttggtatt cataggtgat tggtttcagg acctccccga 9000 atagcaaaat ccacagatgc tgaagtccct ggtaaaaaat agtctagtat ttgcttataa 9060 tctatgcaca ttctcctgta tactttaaat caactccagg ttacttataa tacttaatac 9120 aatgtaaatg ctatgtaagt agttgttttt agggaataat gacaaagaaa aatggtctgt 9180 acatgtttgg tacagatgca atatcttttt ccaaattatt ttctacctgt ggttgattga 9240 atccatggat acagaactca cggatatgga 
aggccaactg tactttaatg caattaagca 9300 aattatctaa cctctttttt tttttttttt tttttaagaa tcctactctg acagctttaa 9360 tgagtagacc tacaaaggga ttatgcttat gataactttt ctacatcagc caacaaaaca 9420 aaataataaa aacaacaaca aaaactgttc acactggatc tataagatgc caaagcacat 9480 gaataaaaaa tagacttaat cattatcagt cttgttcact gccatttgcc cagctcttgg 9540 aacagagcct ggcccaaagt aggtgctcat taaacatagt gtttgctgaa tgaatggatg 9600 tacagcacta attacagccc tttagttaac tattcatcaa gtgctccctg cacacagtgg 9660 acttgtgctg agtatagtga aggaagattc attgtgctgg ttgattttat gtgtcaaccc 9720 agctaggcta tggtacccag ttgtttggtc aaacactagt ctatatgttg ctgtgaaggt 9780 attttttaga tgtgataaca cttagataag taaagtttga gttaagcaga ttgccctcca 9840 taatgtgggt gggtctcatc caatcaattg aagattttaa gagaaaagac tcaaagcggg 9900 tggatcactt gaggtcagga gttcaagacc agcctggcca acatggcaaa accctgtcta 9960 tactaaaaaa aaacaaaaac aaaaatgagc cgggtgtggt ggcacatgtc tgtagtccca 10020 gctactcaga agggtgaggc aggagaatca cttgaaccca ggaggcagag gttgcaatga 10080 gccaagattg caccactgca ctccagcctg ggtgacaaag cgagactgtc tcaaaaaata 10140 aattaaaaaa aaaaaaaaga aaaaagactg aggttccctg aggaagaaga aattctgcct 10200 tcagatttct ttcagactca aaactgcaat atcaaccctt ccatgggtct gtggtattct 10260 gggaacaggg aaaagaacag cccaactaaa gaatcccaaa ggtagagaaa catgctgtta 10320 tggaacctgg agctaggttg cagagagccc tgataagcat agaacatagg ctgcaaaagc 10380 cttaaagatt catcttttaa ataaacagca ttaagtactt ctctttcttg ggcaggctag 10440 agttcctatc atcttgccca ggcctctcat cttgtaacaa gaagctgaga cccagggcga 10500 gtgtccttgg gtcacagggg aaatctttgc aggagaatgc aatcttctac tgcctcagcc 10560 agcactctgg ctaccacatt ttcctgcttc ctgcgtctcc aaactcatct gcccgctccc 10620 gcgttgctta ctgcaactca accacactgg ttctcttgct gctcctcaaa tgctctaagt 10680 ttgttcccac ttcagaacca ttacacttgt tccctctgcc caaaagccaa aaggaccttg 10740 ccttgggttt tcacacactg gagcttcttg tctctgatct cagctcagct gttgtcttct 10800 cagaaaagtc ttccctgacc acaggcttga atggtgcctc atccacatct gtcccattgc 10860 cctgatttac tttacttttt tttttttttg aaattttttt tgagacaggg tcccactctg 10920 tcgcccaggc tggagtgcag 
tggcacaatc tcagctcact gcagcctccc tttcccggct 10980 tcctgggttc aagtgatttt catgcctcag tctcctgagt agctgggact ataggcatgt 11040 gccaccacgc ccggctaatt tttgtgtttt tagtagaaac ggggttttgc catgttggcc 11100 aggctggtct tgaattcctg agctcaagca atccacctgc ctcggactcc caaagtgctg 11160 ggactacagg tgtgagcccc gcatccaggc tactttctct attgtactaa tcaatgtcta 11220 gaattactat tatttacctc tctattgttt gtttccccaa atagaatata aattccctga 11280 gagtagaatt tgttccctaa tgtgatttgc tcattgcttt aaccccagca ccttgaataa 11340 tgtacatatt aggtactcta taattatgaa tcgaattaat gagtgttgaa tacagatctc 11400 agctcttgca aagtggttta aaggggaaaa gagatcaaga gaagcaaaga ggcattttaa 11460 tgaggtaagc ggagaagccg ccagcaatca cctctggtcc cacagcaatg ccctttccct 11520 tcttccattt ttttttcaaa gcagatgcct ctagatggaa ttgttaaggt cacagttctt 11580 gctggataaa aatcacaaat cgtgctgact tcagctcacc attaaaattc cccacaatgc 11640 cagtttcaaa acaaagtgct ctccgaaaga tgaaacctta tgtgttcctc ccttaacgtt 11700 cccctcttaa ccgcgcatca tatgaagcct gtcagcttcc ctggactcac tgtgctttat 11760 cactaatcct tcacttttcc ctttctgcct caaaggagga tgctccccgc ttctcttact 11820 gaaccaaaca tggatttgct tatgctgtgc agtaaagcta aacatccacc ctgaggtttt 11880 gcagcaaaag aaaggaggat gtttatttgc agggccccaa acaaggagaa tctggcagtt 11940 tacacttcag acctgaccta cttgatggct tacaagcaag gctttttgtt gttgttgttt 12000 tgttgttgtt tgtttgtttg tttgttttga gacagggtct cactctgtca tccacgttca 12060 gcgtgtgcag tagtgtgatc tcagctcact gcaacctcca cctcccaggc tcaagcgatt 12120 ctcctgcctc agcctccaga gtagctggga ttacaggcat gcaccactat cgcccgacta 12180 attttgtatt tttagtagaa atggaaaccg gtttcaccat gttgttcagg ctggtcttga 12240 actcctgggc ttaagcgacc ctcccacctt ggcctcccga agtgctggga ttacaggtat 12300 gagccaccgt gcccagcctc aagcaaggtt ttttcaaagg ctgagataaa tttcaggaaa 12360 tcagaaatca caagcaaaat ggtaaatcaa tacatggagg ttatatattg gtttggccta 12420 aaaaggtggt atatctcgaa ttggggcttt acaggtcaca ggtggattca aagactctct 12480 ggtttgcaat tcactaagga agcaaagctt tgtctaaaca tttggggttg gaagaaaaga 12540 atgttaagat ctagcatgtg ggcatgactt tctccagggc cctcaggaag aaatttagaa 12600 
caaagaatga tagagttcag tcctcagttc tcccgtttga ggtctacatg ccagcagaca 12660 gtattttcca tttggtgggg tgcgggggaa ggtaggggcg ggtccaggtt tctgaaaaac 12720 aactcaggga cgaggtatgt taagatgtta tctttagttt ctctagagaa ccaaacatcc 12780 tgtgactcta actttcttgg ctattgtttt aagctattag attactcatt tatttctcaa 12840 ggctagctag gtgcctggaa tttcccttga agaaactcaa gattttcctt tattcccatg 12900 cttgggttgg gggtgggaag gcaggatcct aaaaggggtc cctgctctgt ctcacttcca 12960 tttcaggatt ttcttcttta cttgttctct gaaccccttc ccaactgatc ttcctcctgc 13020 ctcagatcca catttcacat tgccacccaa ggatcttcaa taaccactta tatgaatatt 13080 gctctcttac tcggatacct tcactggctc ctcaaactca cattattgag catggtactc 13140 aaggccctcc ccaaggacaa agaccagctt accctcttca cccttatttc tacgtgccct 13200 agcattaaac tgacgtcaaa gcatatagga tataactaac tacctgcaat tttaatgcaa 13260 tggagctgtt ggcttgtagc tatgacagga aaagacattt gaccacaggg tatgcagggt 13320 atctggtgca aaggcattct ggcagagcaa acccaacaca tcagttttac tgtttcccat 13380 gacttgcgaa tccaccctgg agaggatttg aatcagttga cttcttcctc taaaaccttc 13440 ctagctgcat ggctgtttgt gtgtgtgtga ctgcattatt gagtacacaa gccttttgaa 13500 tcagggatct cagattttag aactgtcaat tatttcacat catcctaaac ctctcttccc 13560 ttccattagc gtggataatt ggaagctgat gagggttagc attacacaag gagataaatt 13620 cattcaacaa acattcagca aacactcatt aagtacctaa tgtatgccac cttgagatac 13680 agtgtagaag aggacacaac catgtaagca tcattacttc agaagaagaa atgctataat 13740 aaaggtatgt tcaatattag gtgtactgag ggaacataga gacagtgctt ctctaacaat 13800 ctaaagagga cagggaaggc tttggggaga aggtgacaag acgtggagga tgtgaatctg 13860 aggcaaggat agaaggagaa gtaaatttgc cagacttcac ccccatacca tcagtataat 13920 ttgagaaata tttttaaaca tcatgactgt acagattttt ttagtggcat tatttgggcc 13980 atactgagtg attgctgcat gcatttctac tcctttgacc tactttttca taagtgtcaa 14040 cagaggcttg caaacaacca ctaaaatagt gttcacaatg tatcaggacg gagttttgaa 14100 ttagaggcgt ccatgctgtg ctgaagctgg tggtcttgtg agcagaaaca ctttgctaag 14160 gttgtaagag agaagcgaag gaagcatagc agggcaagag gaggaaagag acagagcaag 14220 cctaaagcta aagagttgtg aggggtggag cccgggtcat tcagggattt 
ttttaaagtg 14280 ggttgctcac aatgtttctg gaagagcgta caacttagcg gacagaaccc atatgacttg 14340 taaattacaa ggcagtcatt acatagccaa gctacaggaa gggcgccttc tgttagagtg 14400 gctacgtaac tctcgatttc aaaatgccta ttggttcatt tgcaggactc ggcaaaaaat 14460 aagcacaaga gtacatcaaa cttagcattt tggatcaagc caattttgtt gccaaaggac 14520 acttcatgag atcagtagta gctgcctcta tgttgatcat ctaaagtcat agaggtacca 14580 tattctcaac ttcttcctat gaatttcagt ttttaaatta ctgagtcaaa agcagtaagc 14640 atttaaatca cttttgataa ctagcctata tagtgtttat caaaagagct gtattaattt 14700 actgtaccac caacagggta catgtatacc agattcacca cgatctgtca tacatttagc 14760 aaaagtaagt tagtgagtta acaatttatg atttagtata tttacaactg tgtccggaat 14820 tgatgggttc ttggtctcac agacttcaag aatgaagccg tggaccctcg cggtgagtgt 14880 tacagctctt aaggtggcgc gtctggagtc tgtcccttct gatgttcaga tgtgttcgga 14940 gtttcttcct tctggtgggt tcagtggtct tgctggctca gaagtgaagc tgcagacctt 15000 tgcggtgagt gttacagctc ttaaggcagg gtgtctggag ttgttcgttc ctcctggtgg 15060 gttcgtggtc tcgctgggtt caggagtgaa gctgcagatc ttcgcgatga gtgttacagc 15120 tcataaaagc aacgtggacc caaagagtga gcagtagcaa gatttattgc aaagggcaaa 15180 agaacacagc ttccacagtg tggaagggga ccccagcggg ttgccaatgc tggctcggtc 15240 agcctgcttt tattctctta tctggcccca cccacatcct gctgattggt agagctgagt 15300 ggcctgtttt gtcaggctgc tgactggtgc ttttacaatc cctgagctag atacaaaggt 15360 tctccacctc cccatcagat tagttagata cagagttttg acacacacgt tctccagggc 15420 cccaccagag cagctagata cagagtattg attggtgcac tcacaaacct tgagctaaac 15480 acagggtgct gattggtgta tttacaatcc ctgagctaga tataaagact ctccacgtcc 15540 ccaccagact caggagccca gctggcttca cctagtggat cccgcaccgg ggctgcaggt 15600 ggactgcctg ccagtcctgc gccgtgcgct cgcattcctc agcccttggg tggtcgatgg 15660 gactgggcac cgtggagcag gggttggtgc tcgtcgggga ggctggggcc gcacaggagc 15720 ccatggagtg ggtgggaggc tcaggcatgg tgggctgcag gtcccaagcc ctgccccacg 15780 ggaaggcagc taaggcccgg taagaaatcg agcacagcgc tggtgggccg gcactgctgg 15840 gggactcagt acaccctccg cagccactgg cccggatgct aagtcccgca ttgcccgggg 15900 ccagcaaggc tggctggctg ctccgagtgt 
ggagcccacc aagcccacgc ccacccggaa 15960 ctccagctgg cctgcaagcg ccgcacgcag cccgggttcc cgctcgtgcc tctccctcca 16020 cacctccctg caagctgagg gagtgggctc cagccttggc cagcccagaa aggggctccc 16080 acagtgcagt gggggggctg aagggctcct caaatgccgc caaagtggga gcccaggcag 16140 gggaggtgcc gagagcaagc gagggctctg aggactgcca gcacgctgtc acctctcaaa 16200 acagtacctt aacaatggtt taacgtgcat tcaacaaaca tccaggaaac aattattaag 16260 cacttaatgt actttgatta ctggagttat ctttttaggc tggcccaaag ccactagaaa 16320 ggtccatttg aatcaggcat cattgctgat ttgatgattt ctccctttcc ttcatccgcc 16380 ataggaatca acatatatat atatgtgtgt gtatgtatac atatgtatat atacagtcat 16440 tgagttcttt tgactctctc tttcaaaggt ctcttacatc tatctatcct actattttta 16500 ttcctgctac catcagcata gttcatagtg cagaaagctt aagtctgtgc tactgaaaat 16560 gtcttcacag gtctctctgc cttcagtctt ccccatggta ataatcacaa ctcatttact 16620 gagcaactac tatgagccaa gctgactgct gggtgtttca tagacattac ctcccacaac 16680 aaccccatgg ggcagatatt aactccactt tacagatgag gaaactgaga cacagaaaat 16740 taagataatc actgaattta cccaacagta agtggaagag cagatatttt aacctgagtc 16800 tgagtccaaa acctgtgtcc tcaatgaaac atttctcgaa tcagtcctct aacccattct 16860 aaacattcct gtacttgtta agtcattgcc tcttaacttc aaacccctcc tatgataccc 16920 agctttgtga tgctcagctg ggactctgca aaacagtccg ctttgccagt tagctctgtg 16980 ttgggctctg ccaatagggg gtgctagagg aagagctcaa ggctggagga gggaggacaa 17040 agtcctcaaa gcagcccaac tacctgcttc ccgtggactt cctgtgagca tcacttctgt 17100 ggaacttcat cccaacagca gcagttctgt cccacagtag cagctaaaat gcagtttgcc 17160 attttcccaa cacttacaga atgagtttta tcacatcccc ctcaaagatg ccagcattag 17220 ccagcaggtg ccccttttcc gagatgcggg tcccagcccc acttaggccc tcctcccaaa 17280 cagtatctcc tcttcaagtc tgagctttgg cttctcagag ctgctcctct tcagcttcca 17340 gggtttagtc attctaatat cttccttttg ttccctcagc cgtaggggtg gtagtgtttt 17400 gatgctaact tcataatatc ttagagttcc tttttacctg ttcaagtatc tagttaacac 17460 atttatacct cattagcaat tctggaggag ctgttgccag aaggaactta ctccagaaag 17520 caagtgaacc tgagtgatgc aaggagtgaa acatattgga tactgtgttg ttcccccaag 17580 tcccctcttt 
tgggccacag cacccatctt ccagttgctg ggggtatcta cagctgcatc 17640 ccctgggttt catggaactg cctcacccaa ggctacagcc cctgtcagca gtgtctcatg 17700 gccaatgacc tgctgaatga gggcatataa aggccaaccc ccttgcctta atctgggctc 17760 caggattcct gtgggatcag cagaagcctc aggagctacc tcattgtgag tcactttctc 17820 cctctgccca atcttcctgc tttcctcact tatttccatg tgtttctccc cagagtattc 17880 ccaataaacc ttctgcaggt aactgtctat ctcagtcttt ttccaagtat ttctggtacg 17940 ctaaccttaa ttagtgatat ccaacatcac aaattagtga tgtctgtaga gatttcgatg 18000 ggcactatga ttgaacttgg agtggtggag agcaaatacc aagaagaaaa ctattgctac 18060 aggaacagtt gataaatccc atctccataa acccaaagca aaataacaac aatatagaac 18120 tatcattcct gaaaaatagc ctaatgcagg aagggtgtgg tagtcacatc ctggtagcat 18180 tttcaggata tttttgttgt tctgtgtttt caactacttt ttagctttaa caagtcaaat 18240 tgccttgcta ccccgaggcc gcttcagaat acatatgctg cattggtact tgaaatctgc 18300 atatcttatt tttgttttcc actgcctgta attctataaa taaaagggaa aataatgtat 18360 gaaatttcaa cttatctgca attcatatta tataaaacat taacttttgg gaagaaggaa 18420 ttgatttaaa ttattaaaac cctaaaaatg ctaatacaca aaacagtcat ctctcataac 18480 attatgtatt ctaagaagtt gctacttgag tttttcaaag gcttactcat gaactaattt 18540 tctcagtctg caaacaaatt taagtgttca taaatgccat taaaattata tcctgcagtt 18600 ttgaaacaat attcaattca tttttaatga aagaaaatat atataaagga gataaaatat 18660 tcttcaatta gtgttatggt ttatacttca ggcaaagaat aaagcctccc tgaaatattt 18720 caagaaaatt gactattttt ggtttatttt taatatcatt ataagccata actacatttt 18780 gttttaggga gataaacatt tatcacaacc aaatgtatgc ctaccagaag tttgattatc 18840 tgaataaact cgagttgagg aatatttaat atgtcactta aagttaatac ctaggaagaa 18900 gtctcaatta aatcccaggt atagaggtga gtcctcatgt tttatttggc ctgatatgat 18960 attgctgggt ctttcttaag gccttaagat agtgctgtga aaatgtcatt tgtatatttt 19020 tatgatacct gcattttgtt tggatgtatt ttatatgttc ttatttatac cgtagcttat 19080 acttcctgac agtttttcat tttaaattgg atcataaata tccttcactt aatagtttct 19140 tgattttcct ctttgccatt tcttcagatt ttatcttcaa aataacctct cattttcttg 19200 acttcctgga tagcacagag aagaagacac attcttctct gaacaagggg gaggaaggaa 
19260 ggaattacaa taagaagcct aggtagagga agagtgggca gtgagatact tttcctgcta 19320 tcaaatgtct tctctaacat ttagaaaggt gtggaaaggc cgctagttta aagactaaca 19380 aactaggaag gctgaccaca tattagcaga gtttaatatt ttaaccaact gaatattaaa 19440 tttatacaca aacacaagat taaatacata ctcgttaata tgctacctag caactagggc 19500 tagctattta accagtacat ataggtaact gtaactgagt ccttgaactg tggttaaaac 19560 cttctgaaaa tatttgtgcg tgtgtgtgtg tgtgtgtgtg tgtgtgtgta tttacactgc 19620 tcctaaattg acaatggatt tcaggagact ctagaattat atatacagtg aagtaggaaa 19680 gttgaattaa aaaacaaaca aaaaaaacag aaacaactag tgtgctggtg tttgctggtt 19740 acaaagcaac cagtcccctt ttccctcctc ctacctttac ctttgttttt gttttttttt 19800 acctccccag tgtaactggg agaaaactga ccccactacc agatccaggg ttctgattag 19860 tctaaatcaa tcaggatctt acgatctgga ttaggccatt tttgcattgc tataaagaaa 19920 tacctgaagg ccgggtaatt aataaggaaa agaggtttaa ttggctcatg gttc 19974 67 1726 DNA Homo sapiens 67 gagtcggaga cactatccgc ttccatccgt cgcgcagacc ctgccggagc cgctgccgct 60 atggatgatc gagaggatct ggtgtaccag gcgaagctgg ccgagcaggc tgagcgatac 120 gacgaaatgg tggagtcaat gaagaaagta gcagggatgg atgtggagct gacagttgaa 180 gaaagaaacc tcctatctgt tgcatataag aatgtgattg gagctagaag agcctcctgg 240 agaataatca gcagcattga acagaaagaa gaaaacaagg gaggagaaga caagctaaaa 300 atgattcggg aatatcggca aatggttgag actgagctaa agttaatctg ttgtgacatt 360 ctggatgtac tggacaaaca cctcattcca gcagctaaca ctggcgagtc caaggttttc 420 tattataaaa tgaaagggga ctaccacagg tatctggcag aatttgccac aggaaacgac 480 aggaaggagg ctgcggagaa cagcctagtg gcttataaag ctgctagtga tattgcaatg 540 acagaacttc caccaacgca tcctattcgc ttaggtcttg ctctcaattt ttccgtattc 600 tactacgaaa ttcttaattc ccctgaccgt gcctgcaggt tggcaaaagc agcttttgat 660 gatgcaattg cagaactgga tacgctgagt gaagaaagct ataaggactc tacacttatc 720 atgcagttgt tacgtgataa tctgacacta tggacttcag acatgcaggg tgacggtgaa 780 gagcagaata aagaagcgct gcaggacgtg gaagacgaaa atcagtgaga cataagccaa 840 caagagaaac catctctgac caccccctcc tccccatccc accctttgga aactccccat 900 tgtcactgag aaccaccaaa tctgactttt acatttggtc tcagaattta 
ggttcctgcc 960 ctgttggttt tttttttttt tttttttaaa cagttttcaa aagttcttaa aggcaagagt 1020 gaatttctgt ggattttact ggtcccagct tttaggttct ttaagacact aacaggacta 1080 catagaggct ttttcagcat tactgtgtcg tctccgtgcc agatgtggca agatcaccat 1140 tagcaaatgg aaattacatt tgaaagccat tagacttata ggtgatgcaa gcatctaaga 1200 gagaggttaa tcacactata gaggcataag tggtatcagt tttcattttt ctaattgttt 1260 aaactgtgtt ttataccagt gtttgcaagt aattgggtgt tagcttgaga tggttaaagg 1320 tggtttgggg agggacttcg ttgtaatggt tttgctgtaa aaaatgtttc caactccgct 1380 gaaatgttgc tgaaaagcat ggtgctggta acagttcaac aatccgtggc tgctcattct 1440 tgcctacttt actctcccac tgaagcaggt tagcgttgaa ggtggtatgg aaaagcctgc 1500 atgcctgttc aattcttttg tttcttctcc ttccccctcc ccctacctcc ttcccctcac 1560 tcctcccctc cttcgctcgc tcaacctctt ttgttcagta tgtgtaactt gaagctaatt 1620 tgtactactg gatatctgac tggagccaca gatacagaat ctgtattgtt cttactgaaa 1680 cacagcatgg aattaacatt aaacttaaat aaaacaaacc taaatt 1726 68 2178 DNA Homo sapiens 68 gagcagaggg gagacggccg ccgccctggc cgcttccacc acagtttgaa gaaaacaggt 60 ctgaaacaag gtcttacccc cagctgcttc tgaacacagt gactgccaga tctccaaaca 120 tcaagtccag ctttgtccgc caacctgtct gacatgtcgg gacccgtgcc aagcagggcc 180 agagtttaca cagatgttaa tacacacaga cctcgagaat actgggatta cgagtcacat 240 gtggtggaat ggggaaatca agatgactac cagctggttc gaaaattagg ccgaggtaaa 300 tacagtgaag tatttgaagc catcaacatc acaaataatg aaaaagttgt tgttaaaatt 360 ctcaagccag taaaaaagaa gaaaattaag cgtgaaataa agattttgga gaatttgaga 420 ggaggtccca acatcatcac actggcagac attgtaaaag accctgtgtc acgaaccccc 480 gccttggttt ttgaacacgt aaacaacaca gacttcaagc aattgtacca gacgttaaca 540 gactatgata ttcgatttta catgtatgag attctgaagg ccctggatta ttgtcacagc 600 atgggaatta tgcacagaga tgtcaagccc cataatgtca tgattgatca tgagcacaga 660 aagctacgac taatagactg gggtttggct gagttttatc atcctggcca agaatataat 720 gtccgagttg cttcccgata cttcaaaggt cctgagctac ttgtagacta tcagatgtac 780 gattatagtt tggatatgtg gagtttgggt tgtatgctgg caagtatgat ctttcggaag 840 gagccatttt tccatggaca tgacaattat gatcagttgg tgaggatagc caaggttctg 900 
gggacagaag atttatatga ctatattgac aaatacaaca ttgaattaga tccacgtttc 960 aatgatatct tgggcagaca ctctcgaaag cgatgggaac gctttgtcca cagtgaaaat 1020 cagcaccttg tcagccctga ggccttggat ttcctggaca aactgctgcg atatgaccac 1080 cagtcacggc ttactgcaag agaggcaatg gagcacccct atttctacac tgttgtgaag 1140 gaccaggctc gaatgggttc atctagcatg ccagggggca gtacgcccgt cagcagcgcc 1200 aatatgatgt cagggatttc ttcagtgcca accccttcac cccttggacc tctggcaggc 1260 tcaccagtga ttgctgctgc caaccccctt gggatgcctg ttccagctgc cgctggcgct 1320 cagcagtaac ggccctatct gtctcctgat gcctgagcag aggtggggga gtccaccctc 1380 tccttgatgc agcttgcgcc tggcggggag gggtgaaaca cttcagaagc accgtgtctg 1440 aaccgttgct tgtggattta tagtagttca gtcataaaaa aaaaaattat aataggctga 1500 ttttcttttt tctttttttt ttaactcgaa cttttcataa ctcaggggat tccctgaaaa 1560 attacctgca ggtggaatat ttcatggaca attttttttt ctcccctccc aaatttagtt 1620 cctcatcaca aaagaacaaa gataaaccag cctcaatccc ggctgctgca tttaggtgga 1680 gacttcttcc cattcccacc attgttcctc caccgtccca cactttaggg ggttggtatc 1740 tcgtgctctt ctccagagat tacaaaaatg tagcttctca ggggaggcag gaagaaagga 1800 aggaaggaaa gaaggaaggg aggacccaat ctataggagc agtggactgc ttgctggtcg 1860 cttacatcac tttactccat aagcgcttca gtggggttat cctagtggct cttgtggaag 1920 tgtgtcttag ttacatcaag atgttaaaat ctacccaaaa tgcagacaga tactaaaact 1980 ctgtcagtag atcatgtctt actgatctaa ccctaaatcc aactcattta tacttttatt 2040 tttagttcag tttaaaatgt tgataccttc cctcccaggc tccttacctt ggtcttttcc 2100 ctgttcatct cccaacatgc tgtgctccat agctggtagg agagggaagg caaaatcttt 2160 cttagttttc tttatctt 2178 69 2016 DNA Homo sapiens 69 cgatgatgat gatgatgtcg ctgtcaccgt ggaccgagac cgcttcatgg atgagttctt 60 tgagcaggtg gaggagattc gaggcttcat tgacaagatc gcagagaacg tggaggaggt 120 gaagcggaag cacagtgcca tcctggcatc ccccaacccc gatgagaaga cgaaggtgga 180 gctggaagaa ctcatgtccg acataaagaa gacagcaaac aaagttcgtt ccaagttaaa 240 gagcatcgag cagtccatcg agcaagagga aggcctgaac cgctcctccg ctgacctgag 300 gatccggaag acacagcact ccacgctgtc cagaaagttt gtggaggtca tgtcggagta 360 caacgccacg cagtccgtct accgcgagcg 
ctgcaaaggc cgcatccaga ggcagctgga 420 gatcaccggc aggaccacga ccagtgagga gctggaggac atgctggaga gtgggaaccc 480 cgccatcttt gcctctggga tcatcatgga ctccagcatc tcgaagcagg ctctgagcga 540 gattgagacg cggcacagtg agatcatcaa gctggagaac agcatccgtg agctacacga 600 catgttcatg gacatggcca tgctcgtgga gagccaggga gagatgattg acaggatcga 660 gtacaatgtg gaacacgcgg tagactatgt ggagagggcc gtgtctgaca ccaagaaggc 720 cgtcaagtac cagagcaagg cgcgccggaa gaaaatcatg atcatcatct gctgtgtgat 780 cctgggcatc gtcatcgcct ccactgttgg gggcatcttc gcctagaagc cacccaatct 840 gccactccac tccaggtggg ccactccaag gaggccctgg ctgctgccac ctggctgggc 900 tgccctccca acccccgcct ctggctcaga gcaccctccc tcccggcccc catgctccct 960 tctctgccat gggccctccg tccccgcccc gtgtcgtgtg catgatctct gtgagtgtgc 1020 gtctgtacgg gaagaggcag agggaggcag ccagcggggc gtgatgcagt gtgcacagcg 1080 aggagcagac ccaggcaggg ccgccagggt gacacaggcc acgcttcctt gccttcagta 1140 actcggtggg cccaggttct gctcttgccc tggggaccct aacctcgcct ccagctgacc 1200 tgccctgtcc tctccagctg tccccacaag cagagccctg aggggtgggg accagctggc 1260 cacatggtgc tgcttttcag gttaggggag aggtggccct gagggtcagc ccagctctga 1320 gtctcagtcg ctgatcactg ccagggaggc tcaggctgcc atggctccag gctccctccc 1380 ctgcctaggg gcaaagtcca tcgggtcctg ggcctcagct tcccttccca cattcctccg 1440 gccccaggag caaccccttg ggctaggtct gaccccaggt gtccctctgg aaggggctgg 1500 ctggtgccct atttccagcc accccagcag ctagggaggc aaagcaggct gatgtcagtc 1560 cctcaagcca gcgttgcatg tttgggatgg tggctcctgt tgtcttgcgc tctgggaagt 1620 cagatgtcat ttcaggcctg cagtctcatc ctgcccttgc catcctccca tcgatgtgcc 1680 acgtgggtgt cacgtgtccc agatgcagta ttcggcagcc agccggggag ggctacctcc 1740 tcctcctcac caccttgggg cttctcatgg gaaatgtgcc cccgccccag gaccctctcc 1800 cttgtggaca ggcagggaga tgcatgcgag tgcatgcagc aggggatggg gccgtgtccg 1860 tgtgccccac cctccctcgg ctttactcct gcccagtgac tgtgaccact gtccgtgttg 1920 ccttcttgaa cagcgattcc ccccaacccc ttcaccaaag gtcttggtac aaccagctgc 1980 ccattttgtg aaatttttat gtagaataaa caattc 2016 70 1320 DNA Homo sapiens 70 ccccctagcg tcgcgcaggg tcggggactg cgcgcggtgc 
caggccgggc gtgggcgaga 60 gcacgaacgg gctgctgcgg gctgagagcg tcgagctgtc accatgggtg atcacgcttg 120 gagcttccta aaggacttcc tggccggggc ggtcgccgct gccgtctcca agaccgcggt 180 cgcccccatc gagagggtca aactgctgct gcaggtccag catgccagca aacagatcag 240 tgctgagaag cagtacaaag ggatcattga ttgtgtggtg agaatcccta aggagcaggg 300 cttcctctcc ttctggaggg gtaacctggc caacgtgatc cgttacttcc ccacccaagc 360 tctcaacttc gccttcaagg acaagtacaa gcagctcttc ttagggggtg tggatcggca 420 taagcagttc tggcgctact ttgctggtaa cctggcgtcc ggtggggccg ctggggccac 480 ctccctttgc tttgtctacc cgctggactt tgctaggacc aggttggctg ctgatgtggg 540 caggcgcgcc cagcgtgagt tccatggtct gggcgactgt atcatcaaga tcttcaagtc 600 tgatggcctg agggggctct accagggttt caacgtctct gtccaaggca tcattatcta 660 tagagctgcc tacttcggag tctatgatac tgccaagggg atgctgcctg accccaagaa 720 cgtgcacatt tttgtgagct ggatgattgc ccagagtgtg acggcagtcg cagggctgct 780 gtcctacccc tttgacactg ttcgtcgtag aatgatgatg cagtccggcc ggaaaggggc 840 cgatattatg tacacgggga cagttgactg ctggaggaag attgcaaaag acgaaggagc 900 caaggccttc ttcaaaggtg cctggtccaa tgtgctgaga ggcatgggcg gtgcttttgt 960 attggtgttg tatgatgaga tcaaaaaata tgtctaatgt aattaaaaca caagttcaca 1020 gatttacatg aacttgatct acaagttcac agatccattg tgtggtttaa tagactattc 1080 ctaggggaag taaaaagatc tgggataaaa ccagactgaa aggaatacct cagaagagat 1140 gcttcattga gtgttcatta aaccacacat gtattttgta tttattttac atttaaattc 1200 ccacagcaaa tagaaataat ttatcatact tgtacaatta actgaagaat tgataataac 1260 tgaatgtgaa acatcaataa agaccactta atgcacaaaa aaaaaaaaaa aaaaaaaaaa 1320 71 1806 DNA Homo sapiens 71 gccgctcgct cggctccgct ccctggctcg gctccctgcc tccgcgtcgc agcccccgcc 60 gtagccgcct ccgagcccgc cgccacatcc tctgagaaga tggctgtgcc acccacgtat 120 gccgatcttg gcaaatctgc cagggatgtc ttcaccaagg gctatggatt tggcttaata 180 aagcttgatt tgaaaacaaa atctgagaat ggattggaat ttacaagctc aggctcagcc 240 aacactgaga ccaccaaagt gacgggcagt ctggaaacca agtacagatg gactgagtac 300 ggcctgacgt ttacagagaa atggaatacc gacaatacac taggcaccga gattactgtg 360 gaagatcagc ttgcacgtgg actgaagctg accttcgatt 
catccttctc acctaacact 420 gggaaaaaaa atgctaaaat caagacaggg tacaagcggg agcacattaa cctgggctgc 480 gacatggatt tcgacattgc tgggccttcc atccggggtg ctctggtgct aggttacgag 540 ggctggctgg ccggctacca gatgaatttt gagactgcaa aatcccgagt gacccagagc 600 aactttgcag ttggctacaa gactgatgaa ttccagcttc acactaatgt gaatgacggg 660 acagagtttg gcggctccat ttaccagaaa gtgaacaaga agttggagac cgctgtcaat 720 cttgcctgga cagcaggaaa cagtaacacg cgcttcggaa tagcagccaa gtatcagatt 780 gaccctgacg cctgcttctc ggctaaagtg aacaactcca gcctgatagg tttaggatac 840 actcagactc taaagccagg tattaaactg acactgtcag ctcttctgga tggcaagaac 900 gtcaatgctg gtggccacaa gcttggtcta ggactggaat ttcaagcata aatgaatact 960 gtacaattgt ttaattttaa actattttgc agcatagcta ccttcagaat ttagtgtatc 1020 ttttaatgtt gtatgtctgg gatgcaagta ttgctaaata tgttagccct ccaggttaaa 1080 gttgattcag ctttaagatg ttacccttcc agaggtacag aagaaaccta tttccaaaaa 1140 aggtcctttc agtggtagac tcggggagaa cttggtggcc cctttgagat gccaggtttc 1200 ttttttatct agaaatggct gcaagtggaa gcggataata tgtaggcact ttgtaaattc 1260 atattgagta aatgaatgaa attgtgattt cctgagaatc gaaccttggt tccctaaccc 1320 taattgatga gaggctcgct gcttgatggt gtgtacaaac tcacctgaat gggacttttt 1380 tagacagatc ttcatgacct gttcccaccc cagttcatca tcatctcttt tacaccaaaa 1440 ggtctgcagg gtgtggtaac tgtttctttt gtgccatttt ggggtggaga aggtggatgt 1500 gatgaagcca ataattcagg acttattcct tcttgtgttg tgtttttttt tggcccttgc 1560 accagagtat gaaatagctt ccaggagctc cagctataag cttggaagtg tctgtgtgat 1620 tgtaatcaca tggtgacaac actcagaatc taaattggac ttctgttgta ttctcaccac 1680 tcaatttgtt ttttagcagt ttaatgggta cattttagag tcttccattt tgttggaatt 1740 agatcctccc cttcaaatgc tgtaattaac aacacttaaa aaacttgaat aaaatattga 1800 aacctc 1806 72 2409 DNA Homo sapiens 72 tgtttttttg ttttggccaa actctcatgg cttttttctc cttccctcat gttttctcct 60 tccctcttaa gacttggcac ttctccagaa ggaggaggac aaaatgacga agtctaagga 120 ggcagtgaca ttcaaggacg tggctgtggt cttctctgag gaggagctgc aactgctgga 180 ccttgcccag aggaagctgt accgagatgt gatgctggag aacttcagga atgtggtctc 240 agtggggcat cagtccacac 
cagatggcct accacagtta gagagagaag aaaagctgtg 300 gatgatgaag atggcaaccc agagagataa ctcctcagga gccaagaatc taaaagagat 360 ggagactctt caagaagtag gattaaggta cctgcctcat gaagagcttt tctgctccca 420 gatctggcaa cagattacaa gagagttaat caagtatcaa gattctgtgg taaatattca 480 aagaacaggc tgccagttgg aaaaacgaga tgatttgcac tataaagatg agggattcag 540 taatcagagt tcccatcttc aagttcacag agtccacact ggtgaaaaac cctacaaagg 600 agaacattgt gtgaaaagtt tcagctggag ctctcatctt caaattaacc aaagggctca 660 cgcaggagag aagccctaca aatgtgaaaa atgtgataat gccttccgtc ggttttcaag 720 tcttcaagcc catcagagag tccacagtag agcaaaatca tacacaaatg atgcaagtta 780 caggagtttt agtcagaggt cacatcttcc ccatcatcag agagttccca ctggagagaa 840 tccatacaaa tatgaagagt gtgggaggaa tgttgggaaa agctcacatt gtcaagctcc 900 tctgatagtt catacgggag agaaacccta taaatgtgag gagtgtgggg tgggcttcag 960 tcagagatca tatcttcaag ttcatctgaa agttcacact ggaaagaaac catataagtg 1020 tgaagagtgt gggaagagct tcagttggcg ttcacgactg caggctcatg agcgaatcca 1080 cactggcgag aaaccataca aatgcaatgc atgtggcaag agctttagtt acagctcaca 1140 ccttaatatt cattgtagaa tccacacagg agagaaaccc tataagtgtg aggagtgtgg 1200 gaaaggtttc agtgtgggtt cacaccttca ggcccatcag ataagccaca ctggagagaa 1260 gccatacaaa tgtgaggagt gtgggaaagg cttctgccgg gcctcaaatc tgctggacca 1320 tcaaagaggc catactggag agaaaccgta tcagtgtgat gcatgtggta agggcttcag 1380 tcgtagctca gattttaaca ttcattttag agtccataca ggggaaaaac cctataaatg 1440 tgaggagtgt ggcaagggct tcagccaggc ctcaaatctt ctggcccatc aaagaggcca 1500 cactggagag aaaccctaca aatgtggtac atgtgggaag ggcttcagtc ggagctcaga 1560 tcttaatgta cactgtagaa tccacacagg agagaaaccc tataaatgcg agaggtgtgg 1620 taaggccttc agtcagttct ccagccttca ggtgcatcag agagttcaca ctggagagaa 1680 accatatcag tgtgcagagt gtgggaaggg cttcagtgta ggttcacagc ttcaagccca 1740 tcagaggtgc cacactggag agaaacccta tcaatgtgag gagtgtggga agggcttctg 1800 tcgggcctcc aattttctgg cacatcgtgg agtccacaca ggagaaaaac cataccgatg 1860 tgatgtgtgt ggtaagcgct tcagacagag atcctacctt caagcccacc agagggtcca 1920 cacaggagag agaccataca aatgtgagga atgtgggaaa 
gtcttcagct ggagctcata 1980 ccttcaagcc catcaaagag ttcacaccgg agaaaaacca tacaaatgtg aggagtgtgg 2040 gaagggcttc agttggagct caagtcttat cattcatcag cgagtccatg ctgatgatga 2100 gggtgacaag gactttcctt catcagagga ttcacacagg aaaactcgat aaaatatgtt 2160 ttactatctc agatgggtgc tgaaatattt taataatcag agctatcata gacaaaacat 2220 ttgttttata gagtcagtag ttcagccagt gattgggaga ccacacagca gagaagcctc 2280 acaagagtgg agacatatgg actgcattca gaacattgac cattagctga tacatgcaga 2340 caagaggatc aggaaggatg agtctgatct ggagtaaatc agaagtacta agatagaaat 2400 gctgaattc 2409 73 42999 DNA Homo sapiens misc_feature (1)...(42999) n = A,T,C or G 73 gctgacacgc tgtcctctgg cgacctgtcg tcggagaggt tgggcctccg gatgcgcgcg 60 gggctctggc ctcacggtga ccggctagcc ggccgcgctc ctgccttgag ccgcctgccg 120 cggcccgcgg gcctgctgtt ctctcgcgcg tccgagcgtc ccgactcccg gtgccggccc 180 gggtccgggt ctctgaccca cccgggggcg gcggggaagg cggcgagggc caccgtgccc 240 cgtgcgctct ccgctgcggg cgcccggggc gccgcacaac cccacccgct ggctccgtgc 300 cgtgcgtgtc aggcgttctc gtctccgcgg ggttgtccgc cgccccttcc ccggagtggg 360 gggtggccgg agccgatcgg ctcgctggcc ggccggcctc cgctcccggg gggctcttcg 420 atcgatgtgg tgacgtcgtg ctctcccggg ccgggtccga gccgcgacgg gcgaggggcg 480 gacgttcgtg gcgaacggga ccgtccttct cgctccgccc gcgcggtccc ctcgtctgct 540 cctctccccg cccgccggcc ggcgtgtggg aaggcgtggg gtgcggaccc cggcccgacc 600 tcgccgtccc gcccgccgcc ttcgcttcgc gggtgcgggc cggcggggtc ctctgacgcg 660 gcagacagcc ctgcctgtcg cctccagtgg ttgtcgactt gcgggcggcc cccctccgcg 720 gcggtggggg tgccgtcccg ccggcccgtc gtgctgccct ctcggggggg gtttgcgcga 780 gcgtcggctc cgcctgggcc cttgcggtgc tcctggagcg ctccgggttg tccctcaggt 840 gcccgaggcc gaacggtggt gtgtcgttcc cgcccccggc gccccctcct ccggtcgccg 900 ccgcggtgtc cgcgcgtggg tcctgaggga gctcgtcggt gtggggttcg aggcggtttg 960 agtgagacga gacgagacgc gcccctccca cgcggggaag ggcgcccgcc tgctctcggt 1020 gagcgcacgt cccgtgctcc cctctggcgg gtgcgcgcgg gccgtgtgag cgatcgcggt 1080 gggttcgggc cggtgtgacg cgtgcgccgg ccggccgccg aggggctgcc gttctgcctc 1140 cgaccggtcg tgtgtgggtt gacttcggag gcgctctgcc tcggaaggaa 
ggaggtgggt 1200 ggacgggggg gcctggtggg gttgcgcgca cgcgcgcacc ggccgggccc ccgccctgaa 1260 cgcgaacgct cgaggtggcc gcgcgcaggt gtttcctcgt accgcagggc cccctccctt 1320 ccccaggcgt ccctcggcgc ctctgcgggc ccgaggagga gcggctggcg ggtgggggga 1380 gtgtgaccca ccctcggtga gaaaagcctt ctctagcgat ctgagaggcg tgccttgggg 1440 gtaccggatc ccccgggccg ccgcctctgt ctctgcctcc gttatggtag cgctgccgta 1500 gcgacccgct cgcagaggac cctcctccgc ttccccctcg acggggttgg gggggagaag 1560 cgagggttcc gccggccacc gcggtggtgg ccgagtgcgg ctcgtcgcct actgtggccc 1620 gcgcctcccc cttccgagtc gggggaggat cccgccgggc cgggcccggc gctcccaccc 1680 agcgggttgg gacgcggcgg ccggcgggcg gtgggtgtgc gcgcccggcg ctctgtccgg 1740 cgcgtgaccc cctccgtccg cgagtcggct ctccgcccgc tcccgtgccg agtcgtgacc 1800 ggtgccgacg accgcgtttg cgtggcacgg ggtcgggccc gcctggccct gggaaagcgt 1860 cccacggtgg gggcgcgccg gtctcccgga gcgggaccgg gtcggaggat ggacgagaat 1920 cacgagcgac ggtggtggtg gcgtgtcggg ttcgtggctg cggtcgctcc ggggcccccg 1980 gtggcggggc cccggggctc gcgaggcggt tctcggtggg ggccgagggc cgtccggcgt 2040 cccaggcggg gcgccgcggg accgccctcg tgtctgtggc ggtgggatcc cgcggccgtg 2100 ttttcctggt ggcccggccg tgcctgaggt ttctccccga gccgccgcct ctgcgggctc 2160 ccgggtgccc ttgccctcgc ggtccccggc cctcgcccgt ctgtgccctc ttccccgccc 2220 gccgcccgcc gatcctcttc ttccccccga gcggctcacc ggcttcacgt ccgttggtgg 2280 ccccgcctgg gaccgaaccc ggcaccgcct cgtggggcgc cgccgccggc cactgatcgg 2340 cccggcgtcc gcgtcccccg gcgcgcgcct tggggaccgg gtcggtggcg cgccgcgtgg 2400 ggcccggtgg gcttcccgga gggttccggg ggtcggcctg cggcgcgtgc gggggaggag 2460 acggttccgg gggaccggcc gcggctgcgg cggcggcggt ggtgggggga gccgcgggga 2520 tcgccgaggg ccggtcggcc gccccgggtg ccccgcggtg ccgccggcgg cggtgaggcc 2580 ccgcgcgtgt gtcccggctg cggtcggccg cgctcgaggg gtccccgtgg cgtccccttc 2640 cccgccggcc gcctttctcg cgccttcccc gtcgccccgg cctcgcccgt ggtctctcgt 2700 cttctcccgg cccgctcttc cgaaccgggt cggcgcgtcc cccgggtgcg cctcgcttcc 2760 cgggcctgcc gcggcccttc cccgaggcgt ccgtcccggg cgtcggcgtc ggggagagcc 2820 cgtcctcccc gcgtggcgtc gccccgttcg gcgcgcgcgt gcgcccgagc gcggcccggt 
2880 ggtccctccc ggacaggcgt tcgtgcgacg tgtggcgtgg gtcgacctcc gccttgccgg 2940 tcgctcgccc tctccccggg tcggggggtg gggcccgggc cggggcctcg gccccggtcg 3000 ctgcctcccg tcccgggcgg gggcgggcgc gccggccggc ctcggtcgcc ctcccttggc 3060 cgtcgtgtgg cgtgtgccac ccctgcgccg gcgcccgccg gcggggctcg gagccgggct 3120 tcggccgggc cccgggccct cgaccggacc ggctgcgcgg gcgctgcggc cgcacggcgc 3180 gactgtcccc gggccgggca ccgcggtccg cctctcgctc gccgcccgga cgtcggggcc 3240 gccccgcggg gcgggcggag cgccgtcccc gcctcgccgc cgcccgcggg cgccggccgc 3300 gcgcgcgcgc gcgtggccgc cggtccctcc cggccgccgg gcgcgggtcg ggccgtccgc 3360 ctcctcgcgg gcgggcgcga cgaagaagcg tcgcgggtct gtggcgcggg gcccccggtg 3420 gtcgtgtcgc gtggggggcg ggtggttggg gcgtccggtt cgccgcgccc cgccccggcc 3480 ccaccggtcc cggccgccgc ccccgcgccc gctcgctccc tcccgtccgc ccgtccgcgg 3540 cccgtccgtc cgtccgtccg tcgtcctcct cgcttgcggg gcgccgggcc cgtcctcgcg 3600 aggccccccg gccggccgtc cggccgcgtc gggggctcgc cgcgctctac cttacctacc 3660 tggttgatcc tgccagtagc atatgcttgt ctcaaagatt aagccatgca tgtctaagta 3720 cgcacggccg gtacagtgaa actgcgaatg gctcattaaa tcagttatgg ttcctttggt 3780 cgctcgctcc tctcctactt ggataactgt ggtaattcta gagctaatac atgccgacgg 3840 gcgctgaccc ccttcgcggg ggggatgcgt gcatttatca gatcaaaacc aacccggtca 3900 gcccctctcc ggccccggcc ggggggcggg cgccggcggc tttggtgact ctagataacc 3960 tcgggccgat cgcacgcccc ccgtggcggc gacgacccat tcgaacgtct gccctatcaa 4020 ctttcgatgg tagtcgccgt gcctaccatg gtgaccacgg gtgacgggga atcagggttc 4080 gattccggag agggagcctg agaaacggct accacatcca aggaaggcag caggcgcgca 4140 aattacccac tcccgacccg gggaggtagt gacgaaaaat aacaatacag gactctttcg 4200 aggccctgta attggaatga gtccacttta aatcctttaa cgaggatcca ttggagggca 4260 agtctggtgc cagcagccgc ggtaattcca gctccaatag cgtatattaa agttgctgca 4320 gttaaaaagc tcgtagttgg atcttgggag cgggcgggcg gtccgccgcg aggcgagcca 4380 ccgcccgtcc ccgccccttg cctctcggcg ccccctcgat gctcttagct gagtgtcccg 4440 cggggcccga agcgtttact ttgaaaaaat tagagtgttc aaagcaggcc cgagccgcct 4500 ggataccgca gctaggaata atggaatagg accgcggttc tattttgttg gttttcggaa 4560 
ctgaggccat gattaagagg gacggccggg ggcattcgta ttgcgccgct agaggtgaaa 4620 ttcttggacc ggcgcaagac ggaccagagc gaaagcattt gccaagaatg ttttcattaa 4680 tcaagaacga aagtcggagg ttcgaagacg atcagatacc gtcgtagttc cgaccataaa 4740 cgatgccgac cggcgatgcg gcggcgttat tcccatgacc cgccgggcag cttccgggaa 4800 accaaagtct ttgggttccg gggggagtat ggttgcaaag ctgaaactta aaggaattga 4860 cggaagggca ccaccaggag tggagcctgc ggcttaattt gactcaacac gggaaacctc 4920 acccggcccg gacacggaca ggattgacag attgatagct ctttctcgat tccgtgggtg 4980 gtggtgcatg gccgttctta gttggtggag cgatttgtct ggttaattcc gataacgaac 5040 gagactctgg catgctaact agttacgcga cccccgagcg gtcggcgtcc cccaacttct 5100 tagagggaca agtggcgttc agccacccga gattgagcaa taacaggtct gtgatgccct 5160 tagatgtccg gggctgcacg cgcgctacac tgactggctc agcgtgtgcc taccctacgc 5220 cggcaggcgc gggtaacccg ttgaacccca ttcgtgatgg ggatcgggga ttgcaattat 5280 tccccatgaa cgagggaatt cccgagtaag tgcgggtcat aagcttgcgt tgattaagtc 5340 cctgcccttt gtacacaccg cccgtcgcta ctaccgattg gatggtttag tgaggccctc 5400 ggatcggccc cgccggggtc ggcccacggc cctggcggag cgctgagaag acggtcgaac 5460 ttgactatct agaggaagta aaagtcgtaa caaggtttcc gtaggtgaac ctgcggaagg 5520 atcattaacg gagcccggag ggcgaggccc gcggcggcgc cgccgccgcc gcgcgcttcc 5580 ctccgcacac ccaccccccc accgcgacgc ggcgcgtgcg cgggcggggc ccgcgtgccc 5640 gttcgttcgc tcgctcgttc gttcgccgcc cggccccgcc gccgcgagag ccgagaactc 5700 gggagggaga cgggggggag agagagagag agagagagag agagagagag agagagagaa 5760 agaagggcgt gtcgttggtg tgcgcgtgtc gtggggccgg cgggcggcgg ggagcggtcc 5820 ccggccgcgg ccccgacgac gtgggtgtcg gcgggcgcgg gggcggttct cggcggcgtc 5880 gcggcgggtc tgggggggtc tcggtgccct cctccccgcc ggggcccgtc gtccggcccc 5940 gccgcgccgg ctccccgtct tcggggccgg ccggattccc gtcgcctccg ccgcgccgct 6000 ccgcgccgcc gggcacggcc ccgctcgctc tccccggcct tcccgctagg gcgtctcgag 6060 ggtcgggggc cggacgccgg tcccctcccc cgcctcctcg tccgcccccc cgccgtccag 6120 gtacctagcg cgttccggcg cggaggttta aagacccctt ggggggatcg cccgtccgcc 6180 cgtgggtcgg gggcggtggt gggcccgcgg gggagtcccg tcgggagggg cccggcccct 6240 cccgcgcctc 
caccgcggac tccgctcccc ggccggggcc gcgccgccgc cgccgccgcg 6300 gcggccgtcg ggtgggggct ttacccggcg gccgtcgcgc gcctgccgcg cgtgtggcgt 6360 gcgccccgcg ccgtgggggc gggaaccccc gggcgcctgt ggggtggtgt ccgcgctcgc 6420 ccccgcgtgg gcggcgcgcg cctccccgtg gtgtgaaacc ttccgacccc tctccggagt 6480 ccggtcccgt ttgctgtctc gtctggccgg cctgaggcaa ccccctctcc tcttgggcgg 6540 ggggggcggg gggacgtgcc gcgccaggaa gggcctcctc ccggtgcgtc gtcgggagcg 6600 ccctcgccaa atcgacctcg tacgactctt agcggtggat cactcggctc gtgcgtcgat 6660 gaagaacgca gctagctgcg agaattaatg tgaattgcag gacacattga tcatcgacac 6720 ttcgaacgca cttgcggccc cgggttcctc ccggggctac gcctgtctga gcgtcgcttg 6780 ccgatcaatc gccccggggg tgcctccggg ctcctcgggg tgcgcggctg ggggttccct 6840 cgcagggccc gccgggggcc ctccgtcccc ctaagcgcag acccggcggc gtccgccctc 6900 ctcttgccgc cgcgcccgcc ccttccccct ccccccgcgg gccctgcgtg gtcacgcgtc 6960 gggtggcggg ggggagaggg gggcgcgccc ggctgagaga gacggggagg gcggcgccgc 7020 cgccggaaga cggagaggga aagagagagc cggctcgggc cgagttcccg tggccgccgc 7080 ctgcggtccg ggttcctccc tcggggggct ccctcgcgcc gcgcgcggct cggggttcgg 7140 ggttcgtcgg ccccggccgg gtggaaggtc ccgtgcccgt cgtcgtcgtc gtcgcgcgtc 7200 gtcggcggtg ggggcgtgtt gcgtgcggtg tggtggtggg ggaggaggaa ggcgggtccg 7260 gaaggggaag ggtgccggcg gggagagagg gtcgggggag cgcgtcccgg tcgccgcggt 7320 tccgccgccc gcccccggtg gcggcccggc gtccggccga ccggccgctc cccgcgcccc 7380 tcctcctccc cgccgcccct cctccgaggc cccgcccgtc ctcctcgccc tccccgcgcg 7440 tacgcgcgcg cgcccgcccg cccggctcgc ctcgcggcgc gtcggccggg gccgggagcc 7500 cgccccgccg cccgcccgtg gccgcggcgc cggggttcgc gtgtccccgg cggcgacccg 7560 cgggacgccg cggtgtcgtc cgccgtcgcg cgcccgcctc cggctcgcgg ccgcgccgcg 7620 ccgcgccggg gccccgtccc gagcttccgc gtcggggcgg cgcggctccg ccgccgcgtc 7680 ctcggacccg tccccccgac ctccgcgggg gagacgcgcc ggggcgtgcg gcgcccgtcc 7740 cgcccccggc ccgtgcccct ccctccggtc gtcccgctcc ggcggggcgg cgcgggggcg 7800 ccgtcggccg cgcgctctct ctcccgtcgc ctctccccct cgccgggccc gtctcccgac 7860 ggagcgtcgg gcgggcggtc gggccggcgc gattccgtcc gtccgtccgc cgagcggccc 7920 gtccccctcc gagacgcgac 
ctcagatcag acgtggcgac ccgctgaatt taagcatatt 7980 agtcagcgga ggaaaagaaa ctaaccagga ttccctcagt aacggcgagt gaacagggaa 8040 gagcccagcg ccgaatcccc gccccgcggg gcgcgggaca tgtggcgtac ggaagacccg 8100 ctccccggcg ccgctcgtgg ggggcccaag tccttctgat cgaggcccag cccgtggacg 8160 gtgtgaggcc ggtagcggcc ggcgcgcgcc cgggtcttcc cggagtcggg ttgcttggga 8220 atgcagccca aagcgggtgg taaactccat ctaaggctaa ataccggcac gagaccgata 8280 gtcaacaagt accgtaaggg aaagttgaaa agaactttga agagagagtt caagagggcg 8340 tgaaaccgtt aagaggtaaa cgggtggggt ccgcgcagtc cgcccggagg attcaacccg 8400 gcggcgggtc cggccgtgtc ggcggcccgg cggatctttc ccgccccccg ttcctcccga 8460 cccctccacc cgccctccct tcccccgccg cccctcctcc tcctccccgg agggggcggg 8520 ctccggcggg tgcgggggtg ggcgggcggg gccgggggtg gggtcggcgg gggaccgtcc 8580 cccgaccggc gaccggccgc cgccgggcgc atttccaccg cggcggtgcg ccgcgaccgg 8640 ctccgggacg gctgggaagg cccggcgggg aaggtggctc ggggggcccc gtccgtccgt 8700 ccgtcctcct cctcccccgt ctccgccccc cggccccgcg tcctccctcg ggagggcgcg 8760 cgggtcgggg cggcggcggc ggcggcggtg gcggcggcgg cgggggcggc gggaccgaaa 8820 ccccccccga gtgttacagc ccccccggca gcagcactcg ccgaatcccg gggccgaggg 8880 agcgagaccc gtcgccgcgc tctcccccct cccggcgccc acccccgcgg ggaatccccc 8940 gcgagggggg tctcccccgc gggggcgcgc cggcgtctcc tcgtgggggg gccgggccac 9000 ccctcccacg gcgcgaccgc tctcccaccc ctcctccccg cgcccccgcc ccggcgacgg 9060 ggggggtgcc gcgcgcgggt cggggggcgg ggcggactgt ccccagtgcg ccccgggcgg 9120 gtcgcgccgt cgggcccggg ggaggttctc tcggggccac gcgcgcgtcc cccgaagagg 9180 gggacggcgg agcgagcgca cggggtcggc ggcgacgtcg gctacccacc cgacccgtct 9240 tgaaacacgg accaaggagt ctaacacgtg cgcgagtcgg gggctcgcac gaaagccgcc 9300 gtggcgcaat gaaggtgaag gccggcgcgc tcgccggccg aggtgggatc ccgaggcctc 9360 tccagtccgc cgagggcgca ccaccggccc gtctcgcccg ccgcgccggg gaggtggagc 9420 acgagcgcac gtgttaggac ccgaaagatg gtgaactatg cctgggcagg gcgaagccag 9480 aggaaactct ggtggaggtc cgtagcggtc ctgacgtgca aatcggtcgt ccgacctggg 9540 tataggggcg aaagactaat cgaaccatct agtagctggt tccctccgaa gtttccctca 9600 ggatagctgg cgctctcgca gacccgacgc 
acccccgcca cgcagtttta tccggtaaag 9660 cgaatgatta gaggtcttgg ggccgaaacg atctcaacct attctcaaac tttaaatggg 9720 taagaagccc ggctcgctgg cgtggagccg ggcgtggaat gcgagtgcct agtgggccac 9780 ttttggtaag cagaactggc gctgcgggat gaaccgaacg ccgggttaag gcgcccgatg 9840 ccgacgctca tcagacccca gaaaaggtgt tggttgatat agacagcagg acggtggcca 9900 tggaagtcgg aatccgctaa ggagtgtgta acaactcacc tgccgaatca actagccctg 9960 aaaatggatg gcgctggagc gtcgggccca tacccggccg tcgccggcag tcgagagtgg 10020 acgggagcgg cgggggcggc gcgcgcgcgc gcgcgtgtgg tgtgcgtcgg agggcggcgg 10080 cggcggcggc ggcgggggtg tggggtcctt cccccgcccc cccccccacg cctcctcccc 10140 tcctcccgcc cacgccccgc tccccgcccc cggagccccg cggacgctac gccgcgacga 10200 gtaggagggc cgctgcggtg agccttgaag cctagggcgc gggcccgggt ggagccgccg 10260 caggtgcaga tcttggtggt agtagcaaat attcaaacga gaactttgaa ggccgaagtg 10320 gagaagggtt ccatgtgaac agcagttgaa catgggtcag tcggtcctga gagatgggcg 10380 agcgccgttc cgaagggacg ggcgatggcc tccgttgccc tcggccgatc gaaagggagt 10440 cgggttcaga tccccgaatc cggagtggcg gagatgggcg ccgcgaggcg tccagtgcgg 10500 taacgcgacc gatcccggag aagccggcgg gagccccggg gagagttctc ttttctttgt 10560 gaagggcagg gcgccctgga atgggttcgc cccgagagag gggcccgtgc cttggaaagc 10620 gtcgcggttc cggcggcgtc cggtgagctc tcgctggccc ttgaaaatcc gggggagagg 10680 gtgtaaatct cgcgccgggc cgtacccata tccgcagcag gtctccaagg tgaacagcct 10740 ctggcatgtt ggaacaatgt aggtaaggga agtcggcaag ccggatccgt aacttcggga 10800 taaggattgg ctctaagggc tgggtcggtc gggctggggc gcgaagcggg gctgggcgcg 10860 cgccgcggct ggacgaggcg cgcgcccccc ccacgcccgg ggcacccccc tcgcggccct 10920 cccccgcccc acccgcgcgc gccgctcgct ccctccccac cccgcgccct ctctctctct 10980 ctctcccccg ctccccgtcc tcccccctcc ccgggggagc gccgcgtggg ggcgcggcgg 11040 ggggagaagg gtcggggcgg caggggccgc gcggcggccg ccggggcggc cggcgggggc 11100 aggtccccgc gaggggggcc ccggggaccc ggggggccgg cggcggcgcg gactctggac 11160 gcgagccggg cccttcccgt ggatcgcccc agctgcggcg ggcgtcgcgg ccgcccccgg 11220 ggagcccggc ggcggcgcgg cgcgcccccc acccccaccc cacgtctcgg tcgcgcgcgc 11280 gtccgctggg ggcgggagcg 
gtcgggcggc ggcggtcggc gggcggcggg gcggggcggt 11340 tcgtcccccc gccctacccc cccggccccg tccgcccccc gttcccccct cctcctcggc 11400 gcgcggcggc ggcggcggca ggcggcggag gggccgcggg ccggtccccc ccgccgggtc 11460 cgcccccggg gccgcggttc cgcgcgcgcc tcgcctcggc cggcgcctag cagccgactt 11520 agaactggtg cggaccaggg gaatccgact gtttaattaa aacaaagcat cgcgaaggcc 11580 cgcggcgggt gttgacgcga tgtgatttct gcccagtgct ctgaatgtca aagtgaagaa 11640 attcaatgaa gcgcgggtaa acggcgggag taactatgac tctcttaagg tagccaaatg 11700 cctcgtcatc taattagtga cgcgcatgaa tggatgaacg agattcccac tgtccctacc 11760 tactatccag cgaaaccaca gccaagggaa cgggcttggc ggaatcagcg gggaaagaag 11820 accctgttga gcttgactct agtctggcac ggtgaagaga catgagaggt gtagaataag 11880 tgggaggccc ccggcgcccc cccggtgtcc ccgcgagggg cccggggcgg ggtccgcggc 11940 cctgcgggcc gccggtgaaa taccactact ctgatcgttt tttcactgac ccggtgaggc 12000 gggggggcga gcccgagggg ctctcgcttc tggcgccaag cgcccgcccg gccgggcgcg 12060 acccgctccg gggacagtgc caggtgggga gtttgactgg ggcggtacac ctgtcaaacg 12120 gtaacgcagg tgtcctaagg cgagctcagg gaggacagaa acctcccgtg gagcagaagg 12180 gcaaaagctc gcttgatctt gattttcagt acgaatacag accgtgaaag cggggcctca 12240 cgatccttct gaccttttgg gttttaagca ggaggtgtca gaaaagttac cacagggata 12300 actggcttgt ggcggccaag cgttcatagc gacgtcgctt tttgatcctt cgatgtcggc 12360 tcttcctatc attgtgaagc agaattcgcc aagcgttgga ttgttcaccc actaataggg 12420 aacgtgagct gggtttagac cgtcgtgaga caggttagtt ttaccctact gatgatgtgt 12480 tgttgccatg gtaatcctgc tcagtacgag aggaaccgca ggttcagaca tttggtgtat 12540 gtgcttggct gaggagccaa tggggcgaag ctaccatctg tgggattatg actgaacgcc 12600 tctaagtcag aatcccgccc aggcgaacga tacggcagcg ccgcggagcc tcggttggcc 12660 tcggatagcc ggtcccccgc ctgtccccgc cggcgggccg cccccccctc cacgcgcccc 12720 gccgcgggag ggcgcgtgcc ccgccgcgcg ccgggaccgg ggtccggtgc ggagtgccct 12780 tcgtcctggg aaacggggcg cggccggaaa ggcggccgcc ccctcgcccg tcacgcaccg 12840 cacgttcgtg gggaacctgg cgctaaacca ttcgtagacg acctgcttct gggtcggggt 12900 ttcgtacgta gcagagcagc tccctcgctg cgatctattg aaagtcagcc ctcgacacaa 12960 
gggtttgtcc gcgcgcgcgt gcgtgcgggg ggcccggcgg gcgtgcgcgt tcggcgccgt 13020 ccgtccttcc gttcgtcttc ctccctcccg gcctctcccg ccgaccgcgg cgtggtggtg 13080 gggtgggggg gagggcgcgc gaccccggtc ggccgccccg cttcttcggt tcccgcctcc 13140 tccccgttca cgccggggcg gctcgtccgc tccgggccgg gacggggtcc ggggagcgtg 13200 gtttgggagc cgcggaggcg ccgcgccgag ccgggccccg tggcccgccg gtccccgtcc 13260 cgggggttgg ccgcgcggcg cggtgggggg ccacccgggg tcccggccct cgcgcgtcct 13320 tcctcctcgc tcctccgcac gggtcgaccg acgaaccgcg ggtggcgggc ggcgggcggc 13380 gagccccacg ggcgtccccg cacccggccg acctccgctc gcgacctctc ctcggtcggg 13440 cctccggggt cgaccgcctg cgcccgcggg cgtgagactc agcggcgtct cgccgtgtcc 13500 cgggtcgacc gcggccttct ccaccgagcg gcggtgtagg agtgcccgtc gggacgaacc 13560 gcaaccggag cgtccccgtc tcggtcggca cctccggggt cgaccagctg ccgcccgcga 13620 gctccggact tagccggcgt ctgcacgtgt cccgggtcga ccagcaggcg gccgccggac 13680 gcagcggcgc acgcacgcga gggcgtcgat tccccttcgc gcgcccgcgc ctccaccggc 13740 ctcggcccgc ggtggagctg ggaccacgcg gaactccctc tcccacattt ttttcagccc 13800 caccgcgagt ttgcgtccgc gggaccttta agagggagtc actgctgccg tcagccagta 13860 ctgcctcctc ctttttcgct tttaggtttt gcttgccttt tttttttttt tttttttttt 13920 ttttttcttt ctttctttct ttctttcttt ctttctttct ttctttcttt cgcttgtctt 13980 cttcttgtgt tctcttcttg ctcttcctct gtctgtctct ctctctctct ctctctctgt 14040 ctctcgctct cgccctctct ctcttctctc tctctctctc tctctctctg tctctcgctc 14100 tcgccctctc tctctctctt ctctctgtct ctctctctct ctctctctct ctctctctct 14160 gtcgctctcg ccctctcgct ctctctctgt ctctgtctgt gtctctctct ctccctccct 14220 ccctccctcc ctccctccct ccctcccctt ccttggcgcc ttctcggctc ttgagactta 14280 gccgctgtct cgccgtaccc cgggtcgacc ggcgggcctt ctccaccgag cggcgtgcca 14340 cagtgcccgt cgggacgagc cggacccgcc gcgtccccgt ctcggtcggc acctccgggg 14400 tcgaccagct gccgcccgcg agctccggac ttagccggcg tctgcacgtg tcccgggtcg 14460 accagcaggc ggccgccgga cgcagcggcg caccgacgga gggcgctgat tcccgttcac 14520 gcgcccgcgc ctccaccggc ctcggcccgc cgtggagctg ggaccacgcg gaactccctc 14580 tcctacattt ttttcagccc caccgcgagt ttgcgtccgc gggaccttta 
agagggagtc 14640 actgctgccg tcagccagta ctgcctcctc ctttttcgct tttaggtttt gcttgccttt 14700 tttttttttt tttttttttt ttttttcttt ctttctttct ttctttcttt ctttctttct 14760 ttctttcttt ctttcgctct cgctctctcg ctctctccct cgctcgtttc tttctttctc 14820 tttctctctc tctctctctc tctctctctc tctgtctctc gctctcgccc tctctctctc 14880 tttctctctc tctctgtctc tctctctctc tctctctctc tctctctctc cctccctccc 14940 tccccctccc tccctctctc cccttccttg gcgccttctc ggctcttgag acttagccgc 15000 tgtctcgccg tgtcccgggt cgaccggcgg gccttctcca ccgagcggcg tgccacagtg 15060 cccgtcggga cgagccggac ccgccgcgtc cccgtctcgg tcggcacctc cggggtcgac 15120 cagctgccgc ccgcgagctc cggacttagc cggcgtctgc acgtgtcccg ggtcgaccag 15180 caggcggccg ccggacgctg cggcgcaccg acgcgagggc gtcgattccg gttcacgcgc 15240 cggcgacctc caccggcctc ggcccgcggt ggagctggga ccacgcggaa ctccctctcc 15300 cacatttttt tcagccccac cgcgagtttg cgtccgcggg acttttaaga gggagtcact 15360 gctgccgtca gccagtaatg cttcctcctt ttttgctttt tggttttgcc ttgcgttttc 15420 tttctttctt tctttctttc tttctttctt tctttctttc tctctctctc tctctctctc 15480 tctctgtctc tctctctctg tctctctccc ctccctccct ccttggtgcc ttctcggctc 15540 gctgctgctg ctgcctctgc ctccacggtt caagcaaaca gcaagttttc tatttcgagt 15600 aaagacgtaa tttcaccatt ttggccgggc tggtctcgaa ctcccgacct agtgatccgc 15660 ccgcctcggc ctcccaaaga ctgctgggag tacagatgtg agccaccatg cccggccgat 15720 tccttccttt tttcaatctt attttctgaa cgctgccgtg tatgaacata catctacaca 15780 cacacacaca cacacacaca cacacacaca cacacacaca cacacacccc gtagtgataa 15840 aactatgtaa atgatatttc cataattaat acgtttatat tatgttactt ttaatggatg 15900 aatatgtatc gaagccccat ttcatttaca tacacgtgta tgtatatcct tcctcccttc 15960 cttcattcat tatttattaa taattttcgt ttatttattt tcttttcttt tggggccggc 16020 ccgcctggtc ttctgtctct gcgctctggt gacctcagcc tcccaaatag ctgggactac 16080 agggatctct taagcccggg aggagaggtt aacgtgggct gtgatcgcac acttccactc 16140 cagcttacgt gggctgcggt gcggtggggt ggggtggggt ggggtggggt gcagagaaaa 16200 cgattgattg cgatctcaat tgccttttag cttcattcat accctgttat ttgctcgttt 16260 attctcatgg gttcttctgt gtcattgtca 
cgttcatcgt ttgcttgcct gcttgcctgt 16320 ttatttcctt ccttccttcc ttccttcctt ccttccttcc ttccttcctt ccctccctta 16380 ctggcagggt cttcctctgt ctctgccgcc caggatcacc ccaacctcaa cgctttggac 16440 cgaccaaacg gtcgttctgc ctctgatccc tcccatcccc attacctgag actacaggcg 16500 cgcaccacca caccggctga cttttatgtt gtttctcatg ttttccgtag gtaggtatgt 16560 gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtatct 16620 atgtatgtac gtatgtatgt atgtatgtga gtgagatggg tttcggggtt ctatcatgtt 16680 gcccacgctg gtctcgaact cctgtcctca agcaatccgc ctgcctgcct cggccgccca 16740 cactgctgct attacaggcg tgagacgctg cgcctggctc cttctacatt tgcctgcctg 16800 cctgcctgcc tgcctgccta tcaatcgtct tctttttagt acggatgtcg tctcgcttta 16860 ttgtccatgc tctgggcaca cgtggtctct tttcaaactt ctatgattat tattattgta 16920 ggcgtcatct cacgtgtcga ggtgatctcg aacttttagg ctccagagat cctcccgcat 16980 cggcctcccg gagtgctgtg atgacacgcg tgggcacggt acgctctggt cgtgtttgtc 17040 gtgggtcggt tctttccgtt tttaatacgg ggactgcgaa cgaagaaaat tttcagacgc 17100 atctcaccga tccgcctttt cgttctttct ttttattctc tttagacgga gtttcactct 17160 tgtcgcccag ggtggagtac gatggcggct ctcggctcac cgcaccctcc gcctcccagg 17220 ttcaagtgat tctcctgcct cagccttccc gagtagctgg aatgacagag atgagccatc 17280 gtgcccggct aatttttcta tttttagtac agatggggtt tctccatctt ggtcaggctg 17340 gtcttcaact tccgaccgtt ggagaatctt aactttcttg gtggtggttg ttttcctttt 17400 tctttttttt tcttttcttt tctttccttc tcctcccccc cccacccccc ttgtcgtcgt 17460 cctcctcctc ctcctcctcc tcctcctcct cctcctcctc ctcctcctcc tctttcattt 17520 ctttcagctg ggctctccta cttgtgttgc tctgttgctc acgctggtct caaactcctg 17580 gccttgactc ttctcccgtc acatccgccg tctggttgtt gaaatgagca tctctcgtaa 17640 aatggaaaag atgaaagaaa taaacacgaa gacggaaagc acggtgtgaa cgtttctctt 17700 gccgtctccc ggggtgtacc ttggacccgg aaacacggag ggagcttggc tgagtgggtt 17760 ttcggtgccg aaacctcccg agggcctcct tccctctccc ccttgtcccc gcttctccgc 17820 cagccgaggc tcccaccgcc gcccctggca ttttccatag gagaggtatg ggagaggact 17880 gacacgcctt ccagatctat atcctgccgg acgtctctgg ctcggcgtgc cccaccggct 17940 acctgccacc 
ttccagggag ctctgaggcg gatgcgaccc ccaccccccc gtcacgtccc 18000 gctaccctcc cccggctggc ctttgccggg cgaccccagg ggaaccgcgt tgatgctgct 18060 tcggatcctc cggcgaagac ttccaccgga tgccccgggt gggccggttg ggatcagact 18120 ggaccacccc ggaccgtgct gttcttgggg gtgggttgac gtacagggtg gactggcagc 18180 cccagcattg taaagggtgc gtgggtatgg aaatgtcacc taggatgccc tccttccctt 18240 cggtctgcct tcagctgcct caggcgtgaa gacaacttcc catcggaacc tcttctcttc 18300 cctttctcca gcacacagat gagacgcacg agagggagaa acagctcaat agataccgct 18360 gaccttcatt tgtggaatcc tcagtcatcg acacacaaga caggtgacta ggcagggaca 18420 cagatcaaac actatttccg ggtcctcgtg gtgggattgg tctctctctc tctctctctc 18480 tctctctctc tctctctctc tctcgcacgc gcacgcgcgc acacacacac acaatttcca 18540 tatctagttc acagagcaca ctcacttccc cttttcacag tacgcaggct gagtaaaacg 18600 cgccccaccc tccacccgtt ggctgacgaa accccttctc tacaattgat gaaaaagatg 18660 atctgggccg ggcacgctag ctcacgcctg tcactccggc actttgggag gccgaggcgg 18720 gtggatcgct tggggccggg agttcgagac caggctggcc gacgtggcga aaccccgtct 18780 ctctgaaaaa tagaacgatt agccgggcct ggtggcgtgg gcttggaatc acgaccgctc 18840 gggagactgg ggcgggcgac ttgttccaac cggggaggcc gaggccgcga tgagctgaga 18900 tcgtgccgtg gcgatgcggc ctggatgacg gagcgagacc ccgtctcgag agaatcatga 18960 tgttattata agatgagttg tgcgcggtga tggccgcctg tagtcgcggc tactcgggag 19020 gctgagacga ggagaagatc acttgaggcc ccacaggtcg aggcttcggt cggccgtgac 19080 ccactgtatc ctgggcagtc accggtcaag gagatatgcc ccttccccgt ttgcttttct 19140 tttcttccct tctcttttct tctttttgct tctcttttct ttctttcttt ctttctttct 19200 ttctttcttt ctttctttct ttttcttttt ctctcttccc ctctttcttt cctgccttcc 19260 tgcctttctt cttttcttct ttcctccctt cctcccttcc ttctttcctc ccgcctcagc 19320 ctcccaaagt gctgggatga ctggcgggag gcaccatgcc tgcttggccc aaagagaccc 19380 tcttggaaag tgagacgcag agagcgcctt ccagtgatct cattgactga tttagagacg 19440 gcatctcgct ccgtcacccc ggcagtggtg ccgtcgtaac tcactccctg cagcgtggac 19500 gctcctggac tcgagcgatc cttccacctc agcctccaga gtacagagcc tgggaccgcg 19560 ggcacgcgcc actgtgccca caccgttttt aattgttttt ttttcccccg agacagagtt 
19620 tcactctcgt ggcctagact gcagtgcggt ggcgcgatct tggctcaccg caacctctgc 19680 ctcccggttt caagcgattc tcctgcatcg gcctcctgag tagccgggat tgcgggcatg 19740 cgctgccacg tctggctgat ttcgtatttt tagtggagac ggggcttctc catgtcgatc 19800 gggctggttt cgaactcccg acctcaggtg atccgccctc cccggcctcc ggaagtgctg 19860 ggatgacagg cgtgagccac cgcgcccggc cttcattttt aaatgttttc ccacagacgg 19920 ggtctcatca tttctttgca accctcctgc ccggcgtctc aaagtgctgg cgtgacgggc 19980 gtgagccact gcgcctggac tccggggaat gactcacgac caccatcgct ctactgatcc 20040 tttctttctt tctttctttc tttctttctt tctttctttc tttctttctt tctttcttga 20100 tgaattatct tatgatttat ttgtgtactt attttcagac ggagtctcgc tctgggcggg 20160 gcgaggcgag gcgaggcaca gcgcatcgct ttggaagccg cggcaacgcc tttcaaagcc 20220 ccattcgtat gcacagagcc ttattccctt cctggagttg gagctgatgc cttccgtagc 20280 cttgggcttc tctccattcg gaagcttgac aggcgcaggg ccacccagag gctggctgcg 20340 gctgaggatt agggggtgtg ttggggctga aaactgggtc ccctattttt gatacctcag 20400 ccgacacatc ccccgaccgc catcgcttgc tcgccctctg agatcccccg cctccaccgc 20460 cttgcaggct cacctcttac tttcatttct tcctttcttg cgtttgagga gggggtgcgg 20520 gaatgagggt gtgtgtgggg agggggtgcg gggtggggac ggaggggagc gtcctaaggg 20580 tcgatttagt gtcatgcctc tttcaccacc accaccacca ccgaagatga cagcaaggat 20640 cggctaaata ccgcgtgttc tcatctagaa gtgggaactt acagatgaca gttcttgcat 20700 gggcagaacg agggggaccg gggacgcgga agtctgcttg agggaggagg ggtggaagga 20760 gagacagctt caggaagaaa acaaaacacg aatactgtcg gacacagcac tgactacccg 20820 ggtgatgaaa tcatctgcac actgaacacc cccgtcacaa gtttacctat gtcacaatct 20880 tgcacatgta tcgcttgaac gacaaataaa agttaggggg gagaagagag gagagagaga 20940 gagagagaga gacagagaga gacagagaga gagagagagg agggagagag gaaaacgaaa 21000 caccacctcc ttgacctgag tcagggggtt tctggccttt tgggagaacg ttcagcgaca 21060 atgcagtatt tgggcccgtt cttttttttt cttcttcttt tctttctttt tttttggact 21120 gagtctctct cgctctgtca cccaggctgc ggtcgcggtg gcgctctctc ggctcactga 21180 aacctctgct tcccgggttc cagtgattct tcttcggtag ctgggattac aggcgcacac 21240 catgacggcg ggctcatatt cctattttca gtagagacgg 
ggtttctcca cgttggccac 21300 gctggtctcg aactcctgac ctcaaatgat ccgccttcct gggcctccca aagtgctgga 21360 aacgacaggc ctgagccgcc gggatttcag cctttaaaag cgcggccctg ccacctttcg 21420 ctgtggccct tacgctcaga atgacgtgtc ctctctgccg taggttgact ccttgagtcc 21480 cctaggccat tgcactgtag cctgggcagc aagagccaaa ctccgnnccc ccacctcctc 21540 gcgcacataa taactaacta acaaactaac taactaacta aactaactaa ctaactaaaa 21600 tctctacacg tcacccataa gtgtgtgttc ccgtgagagt gatttctaag aaatggtact 21660 gtacactgaa cgcagtggct cacgtctgtc atcccgaggt caggagttcg agaccagccc 21720 ggccaacgtg gtgaaacccc gtctctactg aaaatacgaa atggagtcag gcgccgtggg 21780 gcaggcacct gtaaccccag ctactcggga ggctggggtg gaagaattgc ttgaacctgg 21840 caggcggagg ctgcagtgac ccaagatcgc accactgcac tacagcctgg gcgacagagt 21900 gagacccggt ctccagataa atacgtacat aaataaatac acacatacat acatacatac 21960 atacatacat acatacatac atccatgcat acagatatac aagaaagaaa aaaagaaaag 22020 aaaagaaaga gaaaatgaaa gaaaaggcac tgtattgcta ctgggctagg gccttctctc 22080 tgtctgtttc tctctgttcg tctctgtctt tctctctgtg tctctttctc tgtctgtctg 22140 tctctttctt tctctctgtc tctgtctctg tctttgtctc tctctctccc tctctgcctg 22200 tctcactgtg tctgtcttct gtcttactct ctttctctcc ccgtctgtct ctctctctct 22260 ctctccctcc ctgtttgttt ctctctctcc ctccctgtct gtttctctct ctctctttct 22320 gtctgtttct gtctctctct gtctgtctat gtctttctct gtctgtctct ttctctgtct 22380 gtctgcctct ctctttcttt ttctgtgtct ctctgtcggt ctctctctct ctgtctgtct 22440 gtctgtctct ctctctctct ctctgtgcct atcttctgtc ttactctctt tctctgcctg 22500 tctgtctgtc tctccctccc tttctgtttc tctctctctc tctctctctc tccccctctc 22560 cctgtctgtt tctctccgtc tctctctctt tctgtctgtt tctcactgtc tctctctgtc 22620 catctctctc tctctctgtc tgtctctttc gttctctctg tctgtctgtc tctctctctc 22680 tctctctctc tctctctctc tccctgtctg tctgtttctc tctatctctc gctgtccatc 22740 tctgtctttc tatgtctgtc tctttctctg tcagtctgtc agacaccccc gtgccgggta 22800 gggccctgcc ccttccacga aagtgagaag cgcgtgcttc ggtgcttaga gaggccgaga 22860 ggaatctaga caggcgggcc ttgctgggct tccccactcg gtgtatgatt tcgggaggtc 22920 gaggccgggt ccccgcttgg 
atgcgagggg cattttcaga cttttctctc ggtcacgtgt 22980 ggcgtccgta cttctcctat ttccccgata agctcctcga cttcaacata aacggcgtcc 23040 taagggtcga tttagtgtca tgcctctttc accgccacca ccgaagatga aagcaaagat 23100 cggctaaata ccgcgtgttc tcatctagaa gtgggaactt acagatgaca gttcttgcat 23160 gggcagaacg agggggaccg ggnacgcgga agcctgcttg agggrggagg ggyggaagga 23220 gagacagctt caggaagaaa acaaaacacg aatactgtcg gacacagcac tgactacccg 23280 ggtgatgaaa tcatctgcac actgaacacc cccgtcacaa gtttacctat gtcacagtct 23340 tgctcatgta tgcttgaacg acaaataaaa gttcgggggg gagaagagag gagagagaga 23400 gagagacggg gagagagggg ggagaggggg ggggagagag agagagagag agagagagag 23460 agagagagag agaaagagaa gtaaaaccaa ccaccacctc cttgacctga gtcagggggt 23520 ttctggcctt ttgggagaac gttcagcgac aatgcagtat ttgggcccgt tctttttttc 23580 ttcttcttct tttctttctt tttttttgga ctgagtctct ctcgctctgt cacccaggct 23640 gcggtgcggt ggcgctctct cggctcactg aaacctctgc ttcccgggtt ccagtgattc 23700 ttcttcggta gctgggatta caggtgcgca ccatgacggc cggctcatcg ttctattttt 23760 agtagagacg gggtttctcc acgttggcca cgctggtctc gaactcctga ccacaaatga 23820 tccaccttcc tgggcctccc aaagtgctgg aaacgacagg cctgagccgc cgggatttca 23880 gcctttaaaa gcgcgcggcc ctgccacctt tcgctgcggc ccttacgctc agaatgacgt 23940 gtcctctctg ccataggttg actccttgag tcccctaggc cattgcactg tagcctgggc 24000 agcaagagcc aaactccgtc cccccacctc cccgcgcaca taataactaa ctaactaact 24060 aactaactaa aatctctaca cgtcacccat aagtgtgtgt tcccgtgagg agtgatttct 24120 aagaaatggt actgtacact gaacgcaggc ttcacgtctg tcatcccgag gtcaggagtt 24180 cgagaccagc ccggcccacg tggtgaaacc cccgtctcta ctgaaaatac gaaatggagt 24240 caggcgccgt ggggcaggca cctgtaaccc cagctactcg ggaggctggg gtggaagaat 24300 tgcttgaacc tggcaggcgg aggctgcagt gacccaagat cgcaccactg cactacagcc 24360 tgggcgacag agtgagaccc ggtctccaga taaatacgta cataaataaa tacacacata 24420 catacataca tacatacaac atacatacat acagatatac aagaaagaaa aaaagaaaag 24480 aaaagaaaga gaaaatgaaa gaaaaggcac tgtattgcta ctgggctagg gccttctctc 24540 tgtctgtttc tctctgttcg tctctgtctt tctctctgtg tctctttctc tgtctgtctg 24600 
tctgtctgtc tgtctgtctc tttctttctt tctgtctctg tctttgtccc tctctctccc 24660 tctctgccct gtctcactgt gtctgtcttc tatcttactc tctttctctc cccgtctgtc 24720 tctctctcac tccctccctg tctgtttctc tctctctctc tttctgtctg tttctgtctc 24780 tctctgtctg cctctctctt tctctatctg tctctttctc tgtctgtctg cccctctctt 24840 tctttttctg tgtctctctg tctgtctctc tctctctctg tgcctatctt ctgtcttact 24900 ctctttctct gcctgtctgt ctgtctctct ctgtctctcc ctccctttct gcttctctct 24960 ctctctctct ctctnnnccc tccctgtctg tttctctctg tctccctctc tttctgtctg 25020 tttctcactg tctctctctg tctgtctgtt tcattctctc tgtctctgtc tctgtctctc 25080 tctctctctg tctctccctc tctgtgtgta tcttttgtct tactctcctt ctctgcctgt 25140 ccgtctgtct gtctgtctct ctctctccct gtccctctct ctttctgtct gtttctctct 25200 ctctctctct ctctctctct ctgtctctgt ctttctctgt ctgtcccttt ctctgtctgt 25260 ctgcctctct ctttctcttt ctgtgtctct ctgtctctct ctctgtgcct atcttctgtc 25320 ttactctctt tctctgcctg tctatctgtc tgtctctctc tgtctctctc cctgcctttc 25380 tgtttctctc tctctccctc tctcgctctc tctgtctttc tctctttctc tctgtttctc 25440 tgtctctctc tgtccgtctc tgtctttttc tgtctgtctg tctctctctt tctttctgtc 25500 gtctgtctct gtctctgtct ctgtctctct ctctctctct ctccttgtct ctctcactgt 25560 gtctgtcttc tgtcttactc tccttctctg cctgtccatc tgtctgtctg tctctctctc 25620 tctctcccta cctttctgtt tctctctcgc tagctctctc tctctctgcc tgtttctctc 25680 tttctctctc tgtctttctc tgtctgtctc tttctctgtc tgtctgtctc tttctctctg 25740 tctctgtctc tgtctctctc tctctctctc tctctctctc tgcctctctc actgtgtctg 25800 tcttctgtct tattctcttt ctctctctgt ctctctctct ctctccttta ctgtctgttt 25860 ctctctctct ctctctcttt ctgcctgttt ctctctgtct gtctctgtct ttctctgtct 25920 gtctgcctct ctctttcttt ttctgcgtct ctctgtctct ctctctctct ctctgttcct 25980 atcttctgtc ttactctgtt tccttgcctg cctgcctgtc tgtgtgtctg tctctctctc 26040 tctctctctc tctctctccc tccctttctc tttctctgtc tctctctctc tttctgggtg 26100 tttctctctg tctctctgtc catctctgtc tttctatgtc tgtctctctc tttctctctg 26160 tctctgtctc tgcctctctc tctctctctc tctctctctc tctgtctgtc tctctcactg 26220 tgtgtgtctg tcttctgtct tactctcctt ctctgcctgt ccgtctgtct 
gtctgtctct 26280 ccctctctct ccctcccttt ctgtttctct ctctctctct ttctgtctgt ttctctcttt 26340 ctctctctgt ctgtctcttt ctctgtctgt ctgtctctct ctttcttttt ctctgtctct 26400 ctgtctctct ctgtgtctgt ctctctgtct gtgcctatct tctgtcttac tctctttctc 26460 tggctgtctg cctgtctctc tctctctctc tgtctgtctc cgtccctctc tccctgtctg 26520 tctgtttctc tctctgcctc tctctctctc tgtctgtctc tttctctgtc tgtctgtctc 26580 tctctttctt tttctctgtc tctctgtctc tctctgtgtc tgtctctctt tctgtgccta 26640 tcttctgtct tactctcttt ctctggctgt ctgcctgtct ctctctctct gcctgtctcc 26700 gtccctccct ccctgtctgt ctgtttctct ctctgtctct gtctctctgt ccatctctgt 26760 ctgtctcttt ctctttctct ctctctgtct ctgtctctct ctctctctgc ctgtctctct 26820 cactgtgtct gtcttctgtc ttactctctt tctcttgcct gcctctctgt ctgtctgtct 26880 ctctccctcc atgtctctct ctctctctca ctcactctct ctccgtctct ctctctttct 26940 gtctgtttct ctctctgtct gtctctctcc ctccatgtct ctctctctct ctctcactca 27000 ctctctctcc gtctctctct ctctttctgt ctgtttctct ctctgtctgt ctctctccct 27060 ccatgtctct ctctctccct ctcactcact ctctctccgt ctctctctct ctttctgtct 27120 gtttctttgt ctgtctgtct gtctgtctgt ctgtctctct ctctctctct ctctctctct 27180 ctctctgttt gtctttctcc ctccctgtct gtctgtctgt ctctctctct ctgtctctgt 27240 ctctgtctct ctctctttct ctttctgtct gtttctctct atctctcgct gtccatctct 27300 gtctttctat gtctgtctct ttctctgtca gtctgtcaga cacacccgtg ccggtagggc 27360 cctgcccttc cacgagagtg agaagcgcgt gcttcggtgc ttagagaggc cgagaggaat 27420 ctagacaggc gggccttgct gggcttcccc actcggtgta cgatttcggg aggtcgaggc 27480 cgggtccccg cttggatgcg aggggcattt tcagactttt ctctcggtca cgtgtggcgt 27540 ccgtacttct cctatttccc cgataagtct cctcgacttc aacataaact gttaaggccg 27600 gacgccaaca cggcgaaacc ccgtctctac taaaaataca aagctgagtc gggagcggtg 27660 gggcaggccc tgtaatgcca gctcctcggg aggctgaggc gggagaatcg cttgaaccag 27720 ggaagcggag gctgcaggga gccgagatcg cgccactgca ctacggccca ggctgtagag 27780 tgagtgagac tcggtctcta aataaatacg gaaattaatt aattcattaa ttcttttccc 27840 tgctgacgga catttgcagg caggcatcgg ttgtcttcgg gcatcaccta gcggccactg 27900 ttattgaaag tcgacgttga cacggaggga 
ggtctcgccg acttcaccga gcctggggca 27960 acgggtttct ctctctccct tctggaggcc cctccctctc tccctcgttg cctagggaac 28020 ctcgcctagg gaacctccgc cctgggggcc ctattgttct ttgatcggcg ctttactttt 28080 ctttgtgttt tggcgcctag actcttctac ttgggctttg ggaagggtca gtttaatttt 28140 caagttgccc cccggctccc cccactaccc acgtcccttc accttaattt agtgagncgg 28200 ttaggtgggt ttcccccaaa ccgccccccc ccccccgcct cccaacaccc tgcttggaaa 28260 ccttccagag ccaccccggt gtgcctccgt cttctctccc cttcccccac cccttgccgg 28320 cgatctcatt cttgccaggc tgacatttgc atcggtgggc gtcaggcctc actcgggggc 28380 caccgttttt gaagatgggg gcggcacggt cccacttccc cggaggcagc ttgggccgat 28440 ggcatagccc cttgacccgc gtgggcaagc gggcgggtct gcagttgtga ggcttttccc 28500 cccgctgctt cccgctcagg cctccctccc taggaaagct tcaccctggc tgggtctcgg 28560 tcacctttta tcacgatgtt ttagtttctc cgccctccgg ccagcagagt ttcacaatgc 28620 gaagggcgcc acggctctag tctgggcctt ctcagtactt gcccaaaata gaaacgcttt 28680 ctgaaaacta ataactttnc tcacttaaga tttccaggga cggcgccttg gcccgtgttt 28740 gttggcttgt tttgtttcgt tctgttttgt tttgttcgtg tttttccttt ctcgtatgtc 28800 tttcttttca ggtgaagtag aaatccccag ttttcaggaa gacgtctatt ttccccaaga 28860 cacgttagct gccgtttttt cctgttgtga actagcgctt ttgtgactct ctcaacgctg 28920 cagtgagagc cggttgatgt ttacnatcct tcatcatgac atcttatttt ctagaaatcc 28980 gtaggcgaat gctgctgctg ctcttgttgc tgttgttgtt gttgttgttg tcgtcgttgc 29040 tgttgtcgtt gtcgttgttg ttgtcgttgt cgttgttttc aaagtatacc ccggccaccg 29100 tttatgggat caaaagcatt ataaaatatg tgtgattatt tcttgagcac gcccttcctc 29160 cccctctctc tgtctctctg tctgtctctg tctctctctt tctctgtctg tcttctctct 29220 ctctctctct ctgtgtctct ctctctctgc ctgtctgttt ctctctctct gcctctctct 29280 ctctctctct ctctgcctgt ctctctcact gtgtctgtct tctgtcttac tccctttctc 29340 tgtctgtctg tcggtctctc tctctctctc tccctgtctg tatgtttctc tctgtctctg 29400 tctctctctc tctttctgtt tctctctctc cgtctctgtc tttctctgac tgtctctctc 29460 tttccttctc tctgtctctc tctgcctgtc tctctcactc tgtcttctgt cttatctctc 29520 tctctgcctg cctgtctctc tcactctctc tctctgtgtg tctctctctc tctttctgtt 29580 tctctctgtc 
tctctgtccg tctctgtctt tctctgtctg tctctttgtc tgtctgtctt 29640 tgtctttcct tctctctgtc tctgtctctc tcactgtgtc tgtcttctgt cttagtctct 29700 ctctctctct ctccctgtct gtctgtctct ctctctctct ccccctgtct gtttctctct 29760 ctctctctct ctctctctct ctctgtcttt gtctttcttt ctgtctctgt ctctctctct 29820 ctctctgtgt gtctgtcttc tgtcttactg tctttctctg cctgtctgtc tgtctgtctc 29880 tctctgtctg tctctctctc tctctccccc tgtcggctgt ttctctgtct ctgtctgtgt 29940 ctctctttct gtctgtttct ctctgtctgt ctttctctct ctgtctcttt ctctctgtct 30000 ctctgtctgt ctctgtctct ctctctgtct ctctctctct gtgggggtgt gtgtgtgtgt 30060 gtgtatgtgt gtgtgtgtgt gtgtgtgtgt ctgccttctg tcttactctc tttctctgcc 30120 tgtctgtctg cctgtctgtt tgtctctctc tctctgcctg tctctctccc ttcctgtctg 30180 tttctctctc tttctgtttc tctctgtctc tgtccatctc tgtctttctc cgtctgtctc 30240 tttatctgtc tctctccgtc tgtctcttta tctgtctctc tctctctttc tgtctttctc 30300 tctctgtgta tcgttgtctc tctctgtctg tctctgtctc tgtctctctg tctctctctc 30360 tctctctctc tctctgtctg tctgtccgtc tgtctgtctc ggtctctgcg tctcgctatc 30420 tcccgccctc tctttttttg caaaagaagc tcaagtacat ctaatctaat cccttaccaa 30480 ggcctgaatt cttcacttct gacatcccag atttgatctc cctacagaat gctgtacaga 30540 actggcgagt tgatttctgg acttggatac ctcatagaaa ctacatatga ataaagatcc 30600 aatcctaaaa tctggggtgg cttctccctc gactgtctcg aaaaatcgta cctctgttcc 30660 cctaggatgc cggaagagtt ttctcaatgt gcatctgccc gtgtcctaag tgatctgtga 30720 ccgagccctg tccgtcctgt ctcaaatatg tacgtgcaaa cacttctctc catttccaca 30780 actacccacg gccccttgtg gaaccactgg ctctttgaaa aaaatcccag aagtggtttt 30840 ggctttttgg ctaggaggcc taagcctgct gagaactttc ctgcccagga tcctcgggac 30900 catgcttgct agcgctggat gagtctctgg aaggacgcac gggactccgc aaagctgacc 30960 tgtcccaccg aggtcaaatg gatacctctg cattggcccg aggcctccga agtacatcac 31020 cgtcaccaac cgtcaccgtc agcatccttg tgagcctgcc caaggccccg cctccgggga 31080 gactcttggg agcccggcct tcgtcggcta aagtccaaag ggatggtgac ttccacccac 31140 aaggtcccac tgaacggcga agatgtggag cgtaggtcag agaggggacc aggaggggag 31200 acgtcccgac aggcgacgag ttcccaaggc tctggccacc ccacccacgc cccacgcccc 
31260 acgtcccggg cacccgcggg acaccgccgc tttatcccct cctctgtcca cagccggccc 31320 caccccacca cgcaacccac gcacacacgc tggaggttcc aaaaccacac ggtgtgacta 31380 gagcctgacg gagcgagagc ccatttcacg aggtgggagg ggtgggggtg gggtgggttg 31440 ggggttgtgg ggtctgtggc gagcccgatt ctccctcttg ggtggctaca ggctagaaat 31500 gaatatcgct tcttgggggg aggggcttcc ttaggccatc accgcttgcg ggactacctc 31560 tcaaaccctc ccttgaggcc acaaaataga ttccacccca cccatcgacg tttcccccgg 31620 gtgctggatg tatcctgtca agagacctga gcctgacacc gtcgaattaa acaccttgac 31680 tggctttgtg tgtttgtttg tttctgagat ggagtcttgc tctgtccccc aggctggagt 31740 gcagtggcgt gatctcagct cactggaacc tctgcctcct gggttcaagt gattctcctg 31800 tctcagcgcc accatggccg gctcattttt tttttttttt tttttggtag acacggggtt 31860 tcaccctctt tcattggttt tcactggaga ttctagattc gagccacacc tcattccgtg 31920 ccacagagag acttcttttt tttttttttt tttttaagcg caacgcaaca tgtctgcctt 31980 atttgagtgg cttcctatat cattataatt gtgttataga tgaagaaacg gtattaaaca 32040 ctgtgctaat gatagtgaaa gtgaagacaa aagaaaggct atctattttg tggttagaat 32100 aaagttgctc agtatttaga agctacctaa atacgtcagc atttacactc ttcctagtaa 32160 aagctggccg atctgaataa tcctccttta aacaaacaca atttttgata gggttaagat 32220 ttttttaaga atgcgactcc tgcaaaatag ctgaacagac gatacacatt taaaaaaata 32280 acaacacaag gatcaaccag acttgggaaa aaatcgaaaa ccacacaagt cttatgaaga 32340 actgagttct taaaatagga cggagaacgt agctatcgga agagaaggca gtattggcaa 32400 gttgattgtt acgttggtca gcagtagctg gcactatctt tttggccatc tttcgggcaa 32460 tgtaactact acagcaaaat gagatatgat ccattaaaca acatattcgc aaatcaaaaa 32520 gtgtttcagt aatataatgc ttcagattta gaagcaaatc aaatgataga actccactgc 32580 tgtaataagt caccccaaag atcaccgtat ctgacaaaat aactaccaca gggttatgac 32640 ttcagaatca tactttcttc ttgatattta cttatgtatt tatttttttt aatttatttc 32700 tcttgagacg cgtctcgctc tgtcgcccag gctggagtgc gatggtgtga tctcggctca 32760 ctgcaaccgc cacctccctg ggttcaagcg attctcctgc ctcagcctcc cgagtagctg 32820 ggactacagg tgcccgccac cacgcccagc taatctttat acttttaata gagacggggt 32880 ttcaccgtgt cggcccggat ggtctcgatc tcttgacctc 
gtgacccgcc cgcctcggcc 32940 tcccaaagtg ctgggatgac aggcgtgagc cactgagccc ggccttctct tgacgtttaa 33000 actatgaagt cagtccagag aaacgcaata aatgtcaacg gtgaggatgg tgttgaggca 33060 gaagtaggac cacacttttt cctatcttat tcagttgata acaatatgac ctaggtagta 33120 atttcctatg tgcctactta tacacgagta caaaagagta aaacagagag actgctaaat 33180 taaagggtac gtgaagttct tcatagtaac tccgtaaact ggaacactgt caaaaagcag 33240 cagctagtga attgtttcca tgtatttttc tattatccaa taagtgaact atgctattcc 33300 tttccagtct cccaagcact tcttgtcccc atcaccactt cggtgctcga agaaaaagta 33360 agcaaatcaa ggaacacaag ctaaagaaac acacacacaa accaaagaca actacagcgt 33420 ctgcaaaagt ttgctagaag actgaaactg ttgagtataa ggatctggta ttctacgatc 33480 atgagttcac ttcagagttt gttcaagaca tacgtttcgt aaggaaacat cttagttaga 33540 agttattcag cagtaggtac catccctaag tatttttcac caaatccgtg acaataaaga 33600 gctatctaac cagaaaaatt agcgagtacg ggcaccatcc atagggcttt gtctttacgc 33660 ttcattagca cttaccatgc cttacaatgt ctaggattga ccctgatagc atttcgaaaa 33720 caagctaatg ctttgtccag ttcttcagtg aagacaactc acgccctaat gcgctatagg 33780 cataagcatc atttggatcc acttcgagag ttctctggaa gaattgaatc gcaatatcgt 33840 gttcccgttt gcagaccgaa acagtttccc tgcagcacac caggcctctg gctggcgaat 33900 ttttatccat gtctgtgaag tctttggaca gaactgaaag agcaacctct ttcggaggat 33960 gccaaagtgt tgtagagtag atctccatgc cttcgactct gtaattctca atcctcctaa 34020 cctctgagaa ttgtctttca gcttgcgtgg actctgaaag tttacaatag gccntttccg 34080 atttggcaca gtacccaacc ggtattgcag tggtgagaag ctagatggct caagatgctg 34140 atagcttctt tgccgtggta agaacacaaa gctaaataac ctttccccct ttcacgaaga 34200 aggctcatca agccttccgc tgctgctttt tgtagattaa aagcctgaat ctgaggcgcg 34260 attgcggcta ttttcccttc tgaaatgacg gaagagtcca attttgtcac ttccaggcta 34320 tcacttatgt tcggtggagt tattgctcct ttattagttt tacttttggt tcttctgttt 34380 gggattttag gtggaaactt catttttaat tttctcctaa ttctcctcgg ttgtggagct 34440 gtcactagtc aagagtcgtg aatttcttcg aggncggtgc atttggggga gatgccatag 34500 tggggctcaa tacctgaggt gttgcccttg tcggcggacc agaactttgt gtttttgcaa 34560 ggactggagt tacctttcgg 
ctctttcccc tctgcgagaa gacagacggt gttccggttt 34620 ggccgattct ggcaacaggc ttttctgaag gggctccggt ggatggcacg tcagtgacag 34680 acggtgtctc ataccagtgc agttttgtca atagggtccg tctccgggac ttggggtttc 34740 taatggcaaa atgccaacac ttggggttaa tggactaaca gctgctggtc ctcctaataa 34800 acttcgacca gtttttggtt tatgttgaac ctgtttagat catatggaag ttcctgttcc 34860 cagtgggaca gtatcaggtg aaaggacagc tgaatcgata gaagacactg gggagtctgt 34920 attcaaggag tactttgaat tggaagattc taaattccat ccgtttcatt cgacggtgtc 34980 ctggggtgtt tccgtaagaa cggtctcggg ctgtctgtga cataaactag gacgaggtcc 35040 aagtgttgtg gcgcaacact tggacaggca gttgctaaag ctctctagag aggtgaatca 35100 aaatgtttgg tcaggatctg gcttttcccc cctatttcac atcatgattc aaagggacac 35160 cagaggaaag gatttcaacg aaggctcttt tggtcacatt ctgatccttt ggtaagccga 35220 tctgtcttgc aatatacatg tcccgacgat ggaaggggaa agcgagctga atcaccaaac 35280 tcaggaacga taatatcatc gtggcttttc tgcttatgaa acactccacc cgataagatt 35340 tgatcccctt ctgcaagctt gctgagatca acacaacatt tcgcaagcag gcatttgcat 35400 tgcggggtag tacaactgtg tcctttcaag agtctatatg ttttataggc ctttcctgag 35460 cggtaagaac aggtcgccag taagaacaag gcttcttctg agtgtacttc tgcataaagg 35520 cgttctgcgg gggaaaccgc atctcggtag gcatagtggt ttagtgcttg ccatatagca 35580 gcctggacgg gtccctgcag caccgccatc ctcgaggctc aggcccactt tctgcagtgc 35640 cacaggcacc cccccccccc catagcggct ccggcccggc cagccccggc tcatttaaag 35700 gcaccagccg ccgttaccgg gggatggggg agtccgagac agaatgactt ctttatcctg 35760 ctgactctgg aaagcccggc gccttgtgat ccattgcaaa ccgagagtca cctcgtgttt 35820 agaacacgga tccactccca agttcagtgg ggggatgtga ggggtgtggc aggtaggacg 35880 aaggactctc ttccttctga ttcggtctgc acagtggggc ctagggctgg agctctctcc 35940 gtgcggaccg ctgactccct ctaccttggg ttccctcggc cccaccctgg aacgccgggc 36000 cttggcagat tctggccctt tctggccctt cagtcgctgt cagaaacccc atctcatgct 36060 cggatgcccc gagtgactgt ggctcgcacc tctccggaaa cattggaaat ctctcctcta 36120 cgcgcggcca cctgaaacca caggagctcg ggacacacgt gctttcggga gagaatgctg 36180 agagtctctc gccgactctc tcttgacttg agttcttcgt gggtgcgtgg ttaagacgta 36240 
gtgagaccag atgtattaac tcaggccggg tgctggtggc tcacgcctgt aaccccaaca 36300 ctttgggagg ccgaggccgt aggatccctc gaggaatcgc ctaaccctgg ggaggttgag 36360 gttgcagtga gtgagccata gttgtgtcac tgtgctccag tctgggcgaa agacagaatg 36420 aggccctgcc acaggcaggc aggcaggcag gcaggcagaa agacaacagc tgtattatgt 36480 tcttctcagg gtaggaagca aaaataacag aatacagcac ttaattaatt tttttttttt 36540 ccttcggacg gagtttcact cttggtgccc acgctggagt gcagtggcac catctcggct 36600 caccgcaacc tccacctccc gcgttcaagc gattctcctg cctcagcctc ctgagtagct 36660 gggattacag ggaggagcca ccacacccag ctgattttgt attgttagta gagacggcat 36720 ttctccatgt gggtcaggct ggtctcgaac tggcgacccc agtggatctg cccgccccgg 36780 cctcccaaag tgctggggtg acaggcgtga gccatcgtga ctggccggct acgtttattt 36840 atttattttt ttaattattt tacttttttt tagttttcca ttttaatcta tttatttatt 36900 tacatttatt tatttattta tttatttact tatttattta ttttcgagac agactctcgc 36960 tctgctgccc aggctggagt gcagcggcgt gatctcggct cactgcaacg tccgcctccc 37020 gggttcacgc cattctcctg cctcagcctc ccaagtagct gggactacag gcgcccgcca 37080 ccgtgcccgg ctaacttttt gtattttgag tagagatggg gtttcactgt ggtagccagg 37140 atggtctcga tctcctgacc ccgtgatccg tccacctcgg cctcccaaag tgctgggatg 37200 acaggcgtga gccaccggcc ccggcctatt tatctattta ttaactttga gtccaggtta 37260 tgaaaccagt tagtttttgt aatttttttt tttttttttt ttttttgaga cgaggtttca 37320 ccgtgttgcc aaggcttgga ccgagggatc caccggccct cggcctccca aaagtgcggg 37380 gatgacaggc gcgagcctac cgcgcccgga cccccccttt ccccttcccc cgcttgtctt 37440 cccgacagac agtttcacgg cagagcgttt ggctggcgtg cttaaactca ttctaaatag 37500 aaatttggga cgtcagcttc tggcctcacg gactctgagc cgaggagtcc cctggtctgt 37560 ctatcacagg accgtacacg taaggaggag aaaaatcgta acgttcaaag tcagtcattt 37620 tgtgatacag aaatacacgg attcacccaa aacacagaaa ccagtctttt agaaatggcc 37680 ttagccctgg tgtccgtgcc agtgattctt ttcggtttgg accttgactg agaggattcc 37740 cagtcggtct ctcgtctctg gacggaagtt ccagatgatc cgatgggtgg gggacttagg 37800 ctgcgtcccc ccaggagccc tggtcgatta gttgtgggga tcgccttgga gggcgcggtg 37860 acccactgtg ctgtgggagc ctccatcctt ccccccaccc cctccccagg 
gggatcccaa 37920 ttcattccgg gctgacacgc tcactggcag gcgtcgggca tcacctagcg gtcactgtta 37980 ctctgaaaac ggaggcctca cagaggaagg gagcaccagg ccgcctgcgc acagcctggg 38040 gcaactgtgt cttctccacc gcccccgccc ccacctccaa gttcctccct cccttgttgc 38100 ctaggaaatc gccactttga cgaccgggtc tgattgacct ttgatcaggc aaaaacgaac 38160 aaacagataa ataaataaaa taacacaaaa gtaactaact aaataaaata agtcaataca 38220 acccattaca atacaataag atacgatacg ataggatgcg ataggatacg ataggataca 38280 atacaatagg atacgataca atacaataca atacaataca atacaataca atacaataca 38340 atacaataca atacaatacg ccgggcgcgg tggctcatgc ctgtcatccc gtcactttgg 38400 gatgccgagg tggacgcatc acctgaagtc gggagttgga gacaagcccg accaacatgg 38460 agaaatcccg tctcaattga aaatacaaaa ctagccgggc gcggtggcac atgcctataa 38520 tcccagctgc taggaaggct gaggcaggag aatcgcttga acctgggaag cggaggttgc 38580 agtgagccga gattgcgcca tcgcactcca gtctgagcaa caagagcgaa actccgtctc 38640 aaaaataaat acataaataa atacatacat acatacatac atacatacat acatacatac 38700 ataaattaaa ataaataaat aaaataaaat aaataaatgg gccctgcgcg gtggctcaag 38760 cctgtcatcc cctcactttg ggaggccaag gccggtggat caagaggcgg tcagaccaac 38820 agggccagta tggtgaaacc ccgtctctac tcacaataca caacattagc cgggcgctgt 38880 gctgtgctgt actgtctgta atcccagcta ctcgggaggc cgagctgagg caggagaatc 38940 gcttgaacct gggaggcgga ggttgcagtg agccgagatc gcgccactgc aacccagcct 39000 gggcgacaga gcgagactcc gtctccaaaa aatgaaaatg aaaatgaaac gcaacaaaat 39060 aattaaaaag tgagtttctg gggaaaaaga agaaaagaaa aaagaaaaaa acaacaaaac 39120 agaacaaccc caccgtgaca tacacgtacg cttctcgcct ttcgaggcct caaacacgtt 39180 aggaattatg cgtgatttct ttttttaact tcattttatg ttattatcat gattgatgtt 39240 tcgagacgga gtctcggagg cccgccctcc ctggttgccc agacaacccc gggagacaga 39300 ccctggctgg gcccgattgt tcttctcctt ggtcaggggt ttccttgtct ttcttcgtgt 39360 ctttaacccg cgtggactct tccgcctcgg gtttgacaga tggcagctcc actttaggcc 39420 ttgttgttgt tggggacttt cctgattctc cccagatgta gtgaaagcag gtagattgcc 39480 ttgcctggcc ttgcctggcc ttgccttttc tttctttctt tctttcttta ttactttctc 39540 tttttcttct tcttcttctt cttttttttg 
agacagagtt tcactcttgt tgcccaggct 39600 agagggcaat ggcgcgatct cggctcaccg caccctccgc ctcccaggtt caagcgattc 39660 tcctgcctca gcctcctgat tagctgggat tacaggcatg ggccaccgtg ctggctgatg 39720 tttgtacttt tagtagagac ggtgtttttc catgttggtc aggctggtct cccactccca 39780 acctcaggtg gtccgcctgc cttagcctcc caaagtgctg ggatgacagg cgtgcaaccg 39840 cgcccagcct ctctctctct ctctctctct ctcgctcgct tgcttgcttg ctttcgtgct 39900 ttcttgcttt cccgttttct tgctttcttt ctttctttcg tttctttcat gcttgctttc 39960 ttgcttgctt gcttgctttc gtgctttctt gctttcctgt tttctttctt tctttctttc 40020 tttctttctt ttgtttcttt cttgcttgct ttcttgcttg cttgcttgct ttcgtgcttt 40080 cttgctttcc tgttttcttt ctttctttct ttcttttctt tctttcttgc ttgctttcct 40140 gcttgcttgc tttcgtgctt tcttgttttc tcgatttctt tctttctttt gtttctttcc 40200 tgcttgcttt cttgcttgct tgctttcgtg cttcttgctt tcctgttttc tttctttctt 40260 tctttctttt gtttctttct tgcttgcttt cttgcttgct tgctttcgtg ctgtcttgtt 40320 tctcgatttc tttctttctt ttgtttcttt cctgcttgct ttcttgcttg attgctttcg 40380 tgctttcttg ctttcttgtt ttctttcttt cttttgtttc tttctttctt gcttccttgt 40440 tttcttgctt tcttgcttgc ttgctttcgt gctttcttgt tttcttgctt tctttctttt 40500 gtttctttct tgcttgcttt cttgcttcct tgttttcttg ctttcttgct tgcttgcttt 40560 cgtgctttct ttcttgcttt cttttctttc tttcttttct ttttctttct ttcttgcttt 40620 cttttctttc atcatcatct ttctttcttt cctttctttc tttctttctt tctatctttc 40680 tttctttctt tctttctttc tttctttctt tctttctgtt tcgtcctttt gagacagagt 40740 ttcactcttg tttccacggc tagagtgcaa tggcgcgatc ttggctcacc gcaccttccg 40800 cctcccgggt tcgagcgctt ctcctgcctc cagcctcccg attagcgggg attgacaggg 40860 aggcaccccc acgcctggct tggctgatgt ttgtgttttt agtaggcacg ccgtgtctct 40920 ccatgttgct caggctggtc tccaactccc gacctcctgt gatgcgccca cctcggcctc 40980 tcgaagtgct gggatgacgg gcgtgacgac cgtgcccggc ctgttgactc atttcgcttt 41040 tttatttctt tcgtttccac gcgtttactt atatgtatta atgtaaacgt ttctgtacgc 41100 ttatatgcaa acaacgacaa cgtgtatctc tgcattgaat actcttgcgt atggtaaata 41160 cgtatcggtt gtatggaaat agacttctgt atgatagatg taggtgtctg tgttatacaa 41220 ataaatacac 
atcgctctat aaagaaggga tcgtcgataa agacgtttat tttacgtatg 41280 aaaagcgtcg tatttatgtg tgtaaatgaa ccgagcgtac gtagttatct ctgttttctt 41340 tcttcctctc cttcgtgttt ttcttccttc ctttcttcct ttctctcctt ctttaggttt 41400 ttcttcctct cttcctttcc ttctttctct ctttctgtcc ttttttcctt cgtgctttat 41460 ttctctttcg ttccctgtgt ttccttcttt tttctttcct ctctgtttct ttttcccttc 41520 tttccttcgt ttctttcctc attctttctc tctttttcgt tgtttctttc cttcccgtct 41580 gtcttttaaa aaattggagt gtttcagaag tttactttgt gtatctacgt tttctaaatt 41640 gtctctcttt tctccatttt cttcctccct ccctccctcc ctccctgctc ccttccctcc 41700 ctccttccct ttcgccatct gtctcttttc cccactcccc tccccccgtc tgtctctgcg 41760 tggattccgg aagagcctac cgattctgcc tctccgtgtg tctgcagcga ccccgcgacc 41820 gagtccttgt gtgttctttc tccctccctc cctccctccc tccctccctc cctccctgct 41880 tccgagaggc atctccagag accgcgccgt gggttgtctt ctgactctgt cgcggtcgag 41940 gcagagacgc gttttgggca ccgtttgtgt ggggttgggg cagaggggct gcgttttcgg 42000 cctcgggaag agcttctcga ctcacggttt cgctttcgcg gtccacgggc cgccctgcca 42060 gccggatctg tctcgctgac gtccgcggcg gttgtcgggc tccatctggc ggccgctttg 42120 agatcgtgct ctcggcttcc ggagctgcgg tggcagctgc cgagggaggg gaccgtcccc 42180 gctgtgagct aggcagagct ccggaaagcc cgcggtcgtc agcccggctg gcccggtggc 42240 gccagagctg tggccggtcg cttgtgagtc acagctctgg cgtgcaggtt tatgtggggg 42300 agaggctgtc gctgcgcttc tgggcccgcg gcgggcgtgg ggctgcccgg gccggtcgac 42360 cagcgcgccg tagctcccga ggcccgagcc gcgacccggc ggacccgccg cgcgtggcgg 42420 aggctgggga cgcccttccc ggcccggtcg cggtccgctc atcctggccg tctgaggcgg 42480 cggccgaatt cgtttccgag atccccgtgg ggagccgggg accgtcccgc ccccgtcccc 42540 cgggtgccgg ggagcggtcc ccgggccggg ccgcggtccc tctgccgcga tcctttctgg 42600 cgagtccccg tggccagtcg gagagcgctc cctgagccgg tgcggcccga gaggtcgcgc 42660 tggccggcct tcggtccctc gtgtgtcccg gtcgtaggag gggccggccg aaaatgcttc 42720 cggctcccgc tctggagaca cgggccggcc cctgcgtgtg gccagggcgg ccgggagggc 42780 tccccggccc ggcgctgtcc ccgcgtgtgt ccttgggttg accagaggga ccccgggcgc 42840 tccgtgtgtg gctgcgatgg tggcgttttt ggggacaggt gtccgtgtcc gtgtcgcgcg 
42900 tcgcctgggc cggcggcgtg gtcggtgacg cgacctcccg gccccggggg aggtatatct 42960 ttcgctccga gtcggcaatt ttgggccgcc gggttatat 42999 74 814 DNA Homo sapiens 74 cagtcttgag cattcagcag attcaagatg aagctgaaca tctccttccc agccactggc 60 tgccagaaac tcattgaagt ggacgatgaa cgcacacttc gtactttcta tgagaagcgt 120 atggccacag aagttgctgc tgacgctctg ggtgaagaat ggaagggtta tgtggtccga 180 atcagtggtg ggaacgacaa acaaggtttc cccatgaagc agggtgtctt gacccatggc 240 cgtgtccgcc tgctactgag taaggggcat tcctgttaca gaccaaggag aactggagaa 300 agaaagagaa aatcagttcg tggttgcatt gtggatgcaa atctgagcgt tctcaacttg 360 gttattgtaa aaaaaggaga gaaggatatt cctggactga ctgatactac agtgcctcgc 420 cgcctgggcc ccaaaagagc tagcagaatc cgcaaacgtt tcaatctctc taaagaagat 480 gatgtccgcc agtatgttgt aagaaagccc ttaaataaag aaggtaagaa acctaggacc 540 aaagcaccca agattcagcg tcttgttact ccacgtgtcc tgcagcacaa acggcggcgt 600 attgctctga agcaacagcg taccaagaaa aataaagaag aggctgcaga atatgctaaa 660 cttttggcca agagaatgaa ggaggctaag gagaagcgcc aggaacaaat tgcgaagaga 720 cgcagacttt cctctctgcg agcttctact tctaagtctg aatccagtca gaaatagatt 780 ttttgagtaa gaaataaata agatcagact ctga 814 75 7242 DNA Drosophila melanogaster 75 cat cct cgc ccg ttt cca cgc cgt cgt cct cct cat cat cgg cga gag 48 ctg att gcg tgg tgg tca gag gcg aac cag cgg tct tcg tgg agc tgg 96 gac cca gat caa ggc tgc tca aca gat tgc ctg ccg act ggg aag acg 144 tta ggg tgt cct tgt gat agg agc tgt gcc gat tgc cca gct tag tgg 192 ata gtg tta ggt cgc cgt tgc tcg ttg ggc gta gac tgc cca cca cct 240 gac cac cgg gca ggg tgg cgc ttc tct tgt ggc gac cct tcg act tgg 288 gaa agg cag cca gga tgt tga gcc acc act ggg att cct ctg aac tgg 336 tgc cct tca caa agg tca cgc gct cgg gag cgg tta tgg cga tgg agt 384 tgg ggt gac ctg tca cct cca cgg cgc tgg taa cct cca gca ctt tgg 432 tca t atcaacgcac gcctgcggta tggtttcggg ctatagaaaa tatatgtaaa 486 ttaaagagta aacaagttgt attttaagat tttaattagg agaattaatt aatcggtaat 546 caaatgaact cggcctatcg cgtaataata tacatttttt aatttaatga ctaataaata 606 atataaaatc taattaatag ttcagtaagt 
tagtaaaagt aaatcaatct ggtggcaatt 666 taaaaagcca ctttaattct ttcacttcat aaataatcgg ctggttaagg aaaggtacat 726 ttattgttgt ttttaccgcc agcacactta ttggttctac cgataacgtc cagcaatgct 786 ataataccat aattagaagc tcttttggga ttgtataata ttttatgagc tttgttatct 846 tataaattca gaccacccca taacagagaa ttttattatc tttttatttt tttgttatca 906 ctggagaacc aacgagacgg tatttaaaca agaaaactac aattatgcct atggattgat 966 tagctattac aaactcaaaa attcgtttta attttattat tagctataaa aatggaaatg 1026 gtttaaaata tgttcaaatg aattacttac ataatcatca accgagtacg tcagctcgcc 1086 atcatcatag agaacaaacc atcttctttg ccagcgctac aattgaaaaa gaagacaaat 1146 tttattaaaa atattaaact attctcaagt ttctattatt tgatctttac taagtctaag 1206 tctgtggcta tcagtcgatg agagtgatca actctaaaac aaatttacat tgtcgcttgc 1266 aatttgcaac atgaaaggtg ggacgagaaa tggtgaggaa agacaagatc ggatgtaaat 1326 aatgttcaac gcccccgaca gaaatcataa ttcctttata attcgttctt tcataaattt 1386 tcaggcgttg tcattcccat gaaaggcaac caagccccaa acgccttcgc ctttgcattg 1446 gcactgattc gctgtggatc tggatctcta tctgtatctg catctgtatc tgtatctgaa 1506 tatgaatctg aatcggaatc tggatctgat tctcattgtt attgttggtt gccagaatca 1566 taacaaacgt gcaacagcca caagggtata ggactcaacg tgtgtctgat atttatgcaa 1626 attgttaaaa gtcaaagcaa attaagctca accttcagcg aagatgacgt tgaattctgt 1686 tgccccattg cgctgcaagt tgctagttgc aagttgcaag ttgcaccttt ctgcagttga 1746 tttctcctca tccacctatg cagtcaggtg agagggagtg agtgcgagtg gagtgctgag 1806 gtgtgtcaag cgaattattt ataaggccta gaagaaggca gctcgcacgc gaataatcaa 1866 gactcagcac caatttttag tttatggtct agttctttat aggttttgta cctctttttt 1926 ttgcgttggc tattttgcga ttgaattcat aaatatggaa tcaaatctat agagtggaga 1986 gtggaactaa cgaggtgaga ggtaacaata tagtttttgg gcaatcagaa gcaacaaaca 2046 aatatctgca ataactcgtt gaattcgaaa caaaattaac tgcatttata ctaaatatat 2106 aattgctata ggatgagtta gccgtcttgc ggtttcccaa accccaaaag caaagtcaag 2166 cgtgtaggaa acctgatcag atcgcgggaa agattctctg cactcaatta cgtcaaacca 2226 ggttgatttc ctccttttcg ctgtcgagag attggcaaat gggtcaaatg ggtcaggcag 2286 tggaatagta aattagatta tgtttgcatc gagatgcaat 
gcaagccgcg ccccaaataa 2346 atggaacgtg cgctagtagg gttccccctt gcccctggta acccttcctt tcaccacccg 2406 ttttcccgct tttccgctcc caaacactag aggtaagctg cttagacccc ggcgtttaga 2466 agccccagtt tcgtttcact aggcagacac actcgcagcg ggaagacaat gccagatata 2526 aggcgcacac tgtccactgc acggtgtaca ctgataaaaa tatatatcaa gaccaaatat 2586 tgttaaagat aattgatgtg taaaggaaat acacttgcaa gttaaaatgt tttcacctta 2646 atgtgttttt cttttaataa ctctattaac taatataaat tatcaccaaa acaaaacatt 2706 aatttgggaa atgttatcac caaaagcttt tgccactata gaaaatacag ataaatctaa 2766 aaataaattc ctttgacgca tgcacgaaat aagataaaca aatttgattt tattttctta 2826 tttacacaat tcattttatt tgcatgcatt tcattttgtt cagtgtacct aataaaaacg 2886 atttcgtttg ccccaagcag taagaagatg ttaggcacgt ctgctgataa ggaaaactgt 2946 agccccagac taggccagac catattaaat taacgtctgg aggcgcgaac agtcatacga 3006 ttttttttta tattacttcg cggtcagttg ccaaggcagg agagcaaccc gttcgattag 3066 tgggtcaatt tggaaaatga gttattgact ctgggaaatt gttgagctga aaatttaatc 3126 ggagcccgaa aatttccaat catgcattcc ccaagtgacc atatatggat tagtgataac 3186 gctcgatgcg acccccaaag attatcaaaa atatttaata agaatatatg aaaaaaagat 3246 ttaactttta tgaattccta agcgtcctca aagcttcggg agaactgggc catatatgac 3306 ccgaaataca tgtttatact ttagcaaatg tattttccaa ttaggtgata gaacttgtgt 3366 gcacacacac atatagttct atatcaacaa acaggtttaa gttttatgca aattgaaagc 3426 ttatttcttc cgcatgctta tctctttcct tctcatcatt tgtatgcaaa aaatacatat 3486 gaatttgcag tagcctcctc ccacatcata tttaacgccc tatattcaaa atttgctcaa 3546 gaaaatattt gaaccaaatt gatttttagt caattagttt ttaagtaatt aagtggagta 3606 aacatataca attttattct taccaaacac atatactcat atattttgaa taaataaata 3666 aacaaatata tataaaatct acgaaattgg caaacaaatt ttaaagcatt atagtattgc 3726 cgatttaatt aatataatta aataatatgt acatgtatta atcttgtgtg cgagcatggg 3786 ttaaatctag ctgcattcga aaccgctact ctggctcggc cacaaagtgg gcttggtcgc 3846 tgttgcggac aagtgagatt gctaatgagc tgcttttagg gggcgtgttg tgcttgcttt 3906 ccaacttttc tagattgatt ctacgctgcc tccagcagcc acccctccca tcaccattcc 3966 catcaccatc cagtcccgtt ggctcccagt cacagtatta cacgtatgca 
aattaagccg 4026 aagttcaatt gcgaccgcag caacaacacg atctttctac acttctcctt gctatgcttg 4086 acattcacaa ggtcaaagct cttaatattc tggctcgtgg ccctacactg taagaaatta 4146 ctatagaaat aacggtacac ggaataagat atttttttta gtccatatgc ttttaacaaa 4206 tgtgttttaa gtttatgtta tattattgtt agaaaaccgg tgttttttta aaatcggtta 4266 aaaaattact acgagagaaa aatacaaatt ttgtaaataa gattgactct ttttcgattt 4326 tggaatattt tcattcattt tatgttttta cgttttcact tatttgtttc tcagtgcact 4386 ttctggtgtt ccattttcta ttgggctctt taccccgcat ttgtttgcag atcacttgct 4446 tgcgcatttt tattgcattt tacatattac acattatttg aacgccgctg ctgctgcatc 4506 cgtcgacgtc gactgcactc gcccccacga gagaacagta tttaaggagc tgcgaaggtc 4566 caagtcatgc attattgtct cagtgcagtt gtcagttgca gttcagcaga cgggctaacg 4626 agtacttgca tctcttcaaa tttacttaat tgatcaagta agtagcaaaa gggcacacaa 4686 ttgaaggaaa ttcttgttta attgaattta ttgtgcaagt gcggaaataa aatgacagga 4746 ttgaatagta aatattttgt aaaatcatat ataatcaaat ttattcaatc agaactaatt 4806 caagctgtca caagtagtgc gaactcaatt aattggcatc gaattaaaat ttggaggcct 4866 gtgccgcata ttcgtcttgg aaaatcacct gttagttaac ttctaaaaat aggaatttta 4926 acataactcg tccctgttaa tcggcgccgt gccttcgtta gctatctcaa aagcgagcgc 4986 gtgcagacga gcagtaattt tccaagcatc aggcatataa tatactaata ctaatactaa 5046 tactaatata agaatactaa tatagaaaaa gctttgccgg tacaaaatcc caaacaaaaa 5106 caaaccgtgt gtgccgaaaa ataaaaataa accataaact aggcagcgct gccgtcgccg 5166 tctgagcagc ctgcgtacat agccgagatc gcgtaacggt agataatgaa aagctctacg 5226 taaccgaagc ttctgctgta cggatcttcc tataaatacg gggccgacac gaactggaaa 5286 ccaacaacta acggagccct cttccaattg aaacagatcg aaagagcctg ctaaagcaaa 5346 aaagaagtca ccatgtcgtt tactttgacc aacaagaacg tgattttcgt ggccggtctg 5406 ggaggcattg gtctggacac cagcaaggag ctgctcaagc gcgatctgaa ggtaactatg 5466 cgatgcccac aggctccatg cagcgatgga ggttaatctc gtgtattcaa tcctagaacc 5526 tggtgatcct cgaccgcatt gagaacccgg ctgccattgc cgagctgaag gcaatcaatc 5586 caaaggtgac cgtcaccttc tacccctatg atgtgaccgt gcccattgcc gagaccacca 5646 agctgctgaa gaccatcttc gcccagctga agaccgtcga tgtcctgatc aacggagctg 
5706 gtatcctgga cgatcaccag atcgagcgca ccattgccgt caactacact ggcctggtca 5766 acaccacgac ggccattctg gacttctggg acaagcgcaa gggcggtcca ggtggtatca 5826 tctgcaacat tggatccgtc actggattca atgccatcta ccaggtgccc gtctactccg 5886 gcaccaaggc cgccgtggtc aacttcacca gctccctggc ggtaagttga tcaaaggaaa 5946 cgcgaagttt tcaagaaaaa acaaaactaa tttgatttat aacaccttta gaaactggcc 6006 cccattaccg gcgtgacggc ttacactgtg aaccccggca tcacccgcac caccctggtg 6066 cacacgttca actcctggtt ggatgttgag cctcaggttg ccgagaagct cctggctcat 6126 cccacccagc cctcgttggc ctgcgccgag aacttcgtca aggctatcga gctgaaccag 6186 aacggagcca tctggaaact ggacttgggc accctggagg ccatccagtg gaccaagcac 6246 tgggactccg gcatctaaga agtgatactc ccaaaaaaaa aaaaaaacat aacattagtt 6306 catagggttc gcgaaccaca agatattcac gcaaggcaat taaggctgat tcgatgcaca 6366 ctcacattct tctcctaata cgataataaa actttccatg aaaaatatgg aaaaatatat 6426 gaaaattgag aaatccaaaa aactgataaa cgctctactt aattaaaata gataaatggg 6486 agcggcagga atggcggagc atggccaagt tcctccgcca atcagtcgta aaacagaagt 6546 cgtggaaagc ggatagaaag aatgttcgat ttgacgggca agcatgtctg ctatgtggcg 6606 gattgcggag gaattgcact ggagaccagc aaggttctca tgaccaagaa tatagcggtg 6666 agtgagcggg aagctcggtt tctgtccaga tcgaactcaa aactagtcca gccagtcgct 6726 gtcgaaacta attaagttaa tgagtttttc atgttagttt cgcgctgagc aacaattaag 6786 tttatgtttc agttcggctt agatttcgct gaaggacttg ccactttcaa tcaatacttt 6846 agaacaaaat caaaactcat tctaatagct tggtgttcat ctttttttta atgataagca 6906 ttttgtcgtt tatacttttt atatttcgat attaaaccac ctatgaagtt cattttaatc 6966 gccagataag caatatattg tgtaaatatt tgtattcttt atcaggaaat tcagggagac 7026 ggggaagtta ctatctacta aaagccaaac aatttcttac agttttactc tctctactct 7086 agaaactggc cattttacag agtacggaaa atccccaggc catcgctcag ttgcagtcga 7146 taaagccgag tacccaaata tttttctgga cctacgacgt gaccatggca agggaagata 7206 tgaagaagta cttcgatgag gtgatggtcc aaatgg 7242 76 256 PRT Drosophila melanogaster 76 Met Ser Phe Thr Leu Thr Asn Lys Asn Val Ile Phe Val Ala Gly Leu 1 5 10 15 Gly Gly Ile Gly Leu Asp Thr Ser Lys Glu Leu Leu Lys Arg Asp 
Leu 20 25 30 Lys Asn Leu Val Ile Leu Asp Arg Ile Glu Asn Pro Ala Ala Ile Ala 35 40 45 Glu Leu Lys Ala Ile Asn Pro Lys Val Thr Val Thr Phe Tyr Pro Tyr 50 55 60 Asp Val Thr Val Pro Ile Ala Glu Thr Thr Lys Leu Leu Lys Thr Ile 65 70 75 80 Phe Ala Gln Leu Lys Thr Val Asp Val Leu Ile Asn Gly Ala Gly Ile 85 90 95 Leu Asp Asp His Gln Ile Glu Arg Thr Ile Ala Val Asn Tyr Thr Gly 100 105 110 Leu Val Asn Thr Thr Thr Ala Ile Leu Asp Phe Trp Asp Lys Arg Lys 115 120 125 Gly Gly Pro Gly Gly Ile Ile Cys Asn Ile Gly Ser Val Thr Gly Phe 130 135 140 Asn Ala Ile Tyr Gln Val Pro Val Tyr Ser Gly Thr Lys Ala Ala Val 145 150 155 160 Val Asn Phe Thr Ser Ser Leu Ala Lys Leu Ala Pro Ile Thr Gly Val 165 170 175 Thr Ala Tyr Thr Val Asn Pro Gly Ile Thr Arg Thr Thr Leu Val His 180 185 190 Thr Phe Asn Ser Trp Leu Asp Val Glu Pro Gln Val Ala Glu Lys Leu 195 200 205 Leu Ala His Pro Thr Gln Pro Ser Leu Ala Cys Ala Glu Asn Phe Val 210 215 220 Lys Ala Ile Glu Leu Asn Gln Asn Gly Ala Ile Trp Lys Leu Asp Leu 225 230 235 240 Gly Thr Leu Glu Ala Ile Gln Trp Thr Lys His Trp Asp Ser Gly Ile 245 250 255 77 953 DNA Drosophila melanogaster CDS (81)...(716) 77 ctgtccataa atgctgaccc gtcagctgtc cagtgttcag tgatctttca gaaccgcatc 60 gttatacttg cagtctagcc atg aaa acc ttt cac tcc cta cta ttc gcc ctg 113 Met Lys Thr Phe His Ser Leu Leu Phe Ala Leu 1 5 10 att ttg gcc ctc gtg agc ctg gac ccc tct tcc ggt tac tcc cgc cgg 161 Ile Leu Ala Leu Val Ser Leu Asp Pro Ser Ser Gly Tyr Ser Arg Arg 15 20 25 aaa gat gag cgt ata att aga gaa tat aag cga gag caa gag tta aat 209 Lys Asp Glu Arg Ile Ile Arg Glu Tyr Lys Arg Glu Gln Glu Leu Asn 30 35 40 cac gcc aag atc gcc aga gat ttg gtc cat cgc gcc aat tgg gcg gct 257 His Ala Lys Ile Ala Arg Asp Leu Val His Arg Ala Asn Trp Ala Ala 45 50 55 gtg gga agt ctc tcc acc aac gaa cgg gtg aag ggg tat ccc atg gtc 305 Val Gly Ser Leu Ser Thr Asn Glu Arg Val Lys Gly Tyr Pro Met Val 60 65 70 75 aac att atc tcc ttc gac gac agc gat gct aat aac agg tcc act gga 353 Asn Ile Ile Ser Phe Asp Asp Ser Asp Ala Asn 
Asn Arg Ser Thr Gly 80 85 90 cgt att cga ttc ctg cta acc gat ttg gac ttc act ggt ccc gac tgg 401 Arg Ile Arg Phe Leu Leu Thr Asp Leu Asp Phe Thr Gly Pro Asp Trp 95 100 105 cag aag gac aac aag gtc aca ctc ctg ttc agt gac gag cag acc cta 449 Gln Lys Asp Asn Lys Val Thr Leu Leu Phe Ser Asp Glu Gln Thr Leu 110 115 120 aga tgc aag gag ggc gga aag gat ccc atg gag cct aca tgt gcc cgt 497 Arg Cys Lys Glu Gly Gly Lys Asp Pro Met Glu Pro Thr Cys Ala Arg 125 130 135 tcc atg atc agt gga cag gtg aag aag atg gat cct agc gac aaa agc 545 Ser Met Ile Ser Gly Gln Val Lys Lys Met Asp Pro Ser Asp Lys Ser 140 145 150 155 tat cag cca tcg ctg gat gcc tat gtg agg cgt cat cca gct gcc atc 593 Tyr Gln Pro Ser Leu Asp Ala Tyr Val Arg Arg His Pro Ala Ala Ile 160 165 170 aat tgg gta aaa gct cac aat ttc tac ctt tgc gaa ctg gaa att agc 641 Asn Trp Val Lys Ala His Asn Phe Tyr Leu Cys Glu Leu Glu Ile Ser 175 180 185 aat atc ttt gtt ccg gac ttc tat gga ggt cct cat aaa gtg agc gcc 689 Asn Ile Phe Val Pro Asp Phe Tyr Gly Gly Pro His Lys Val Ser Ala 190 195 200 tct gac tat tac gct gtt tcg aat tga tgaactcctc caaaagcaat 736 Ser Asp Tyr Tyr Ala Val Ser Asn * 205 210 ccaactgacg cctcatatac tttgtcaaaa aatgaaatga taagcatatt tgcacaacat 796 gtctgccaca aagggaaaga atgaagacaa aggcctttat gaaagccact tatcagttga 856 gctctgtgat ttcaatcgat agaacaataa ctttgatata aataaaaata cattccgttg 916 aatgtactct gacatactga aaaaaaaaaa aaaaaaa 953 78 211 PRT Drosophila melanogaster 78 Met Lys Thr Phe His Ser Leu Leu Phe Ala Leu Ile Leu Ala Leu Val 1 5 10 15 Ser Leu Asp Pro Ser Ser Gly Tyr Ser Arg Arg Lys Asp Glu Arg Ile 20 25 30 Ile Arg Glu Tyr Lys Arg Glu Gln Glu Leu Asn His Ala Lys Ile Ala 35 40 45 Arg Asp Leu Val His Arg Ala Asn Trp Ala Ala Val Gly Ser Leu Ser 50 55 60 Thr Asn Glu Arg Val Lys Gly Tyr Pro Met Val Asn Ile Ile Ser Phe 65 70 75 80 Asp Asp Ser Asp Ala Asn Asn Arg Ser Thr Gly Arg Ile Arg Phe Leu 85 90 95 Leu Thr Asp Leu Asp Phe Thr Gly Pro Asp Trp Gln Lys Asp Asn Lys 100 105 110 Val Thr Leu Leu Phe Ser Asp Glu Gln Thr Leu 
Arg Cys Lys Glu Gly 115 120 125 Gly Lys Asp Pro Met Glu Pro Thr Cys Ala Arg Ser Met Ile Ser Gly 130 135 140 Gln Val Lys Lys Met Asp Pro Ser Asp Lys Ser Tyr Gln Pro Ser Leu 145 150 155 160 Asp Ala Tyr Val Arg Arg His Pro Ala Ala Ile Asn Trp Val Lys Ala 165 170 175 His Asn Phe Tyr Leu Cys Glu Leu Glu Ile Ser Asn Ile Phe Val Pro 180 185 190 Asp Phe Tyr Gly Gly Pro His Lys Val Ser Ala Ser Asp Tyr Tyr Ala 195 200 205 Val Ser Asn 210 79 3271 DNA Drosophila melanogaster 79 ggt gag atg agc cac aat gat tat gaa tat ttt aac gat tac agt gtt 48 cag act cat gat aaa aac cgg tac cat gaa ggt cgc tct agc att gga 96 tat cag cca gcg atc cac aac ata gaa tat gaa aat cag aaa gga cat 144 cac gaa tct ttt gct gat gat cat gaa aat att gat cat gaa gac ttc 192 ttt gga aat acg cag gaa ata att aca ttt gct gaa gag ggt gag tag 240 att ttt att caa cac agg att aaa cgc tat cgt cga ggg caa att aaa 288 aaa taa gag acg tcg agt tcg ccg act atc aga tac ctg tta cta taa 336 ata agt aaa tat aaa ttt tat ttt ata ccc gtt ata act agt ggg tac 384 gca aac gcg aaa ttt cat aat ttt gct ggg ata tcg ttg gat att acg 432 ggg aaa taa aat gag aaa aaa att gta taa ttc ttt aat agt gag gga 480 gtg agc ctc ttg ggc gtt ttg ttg gtg aaa gga tgt gcg tga tca cag 528 tgt ttt ggt ata ccg gtg gaa att agc aag aca aac aat aaa aca aag 576 aaa aat cta aaa aaa att ttc taa agt gtg ggc gtg gca gtt ttt tgg 624 caa atc cat aaa att tta aaa gac taa taa aat tat gaa aaa ata tcc 672 aca aat ttg tca aaa aaa aaa aat tgt cgg cgg cat ttt ttg ggt cgt 720 tag agt ggg cgt ggc aaa aag ttt gtt gac aaa tcg ata aaa aaa aat 768 gta caa gac tac aga cta aaa ata tga gaa aat atc aaa act ttt tca 816 aaa gtt ttc gga ggt ttg tgg gag ttg gct caa cac acg tgc gat gcg 864 ttt agg tgt cta caa tcg gta tgc ttc tag cta gta tgc ttc taa cct 912 tct agc ttt tat agt tca tga gat cta gac gtt cca tac aga cat ggc 960 tag att gat tcg gct att gat cct gat caa gaa tat ata tac tat ata 1008 tag tcg gaa acg att ctt cct gcc ttt aac ata ctt tac tat act ata 1056 cta tac tat ata cta tag tta cta 
tta tat att att ata tta taa tat 1104 ata tta agt ata tag taa tag taa tat aca gta ata aaa ata aga aag 1152 agt gat cgt gcc caa aaa aca tta ggc aaa tca gta gaa acc ggc ttg 1200 tgg cgg tat gta tgg cat acc gat aga aat tgg caa aac aaa tta tat 1248 aat gac taa aaa taa aaa cgc ttt ctt aaa agt gta agc gtg gta gct 1296 tgt gag agt gct tct tct ctg aaa ttt tta att ttt tat ctg aac ttt 1344 cta gat ttt aca ggc cct gac atc gat acg atc gat atg gac aaa aac 1392 gaa tat att taa ata cgt tat att aaa gtt gaa gta gtt gag gtt ttt 1440 tta atg tct acc aat atc cgc cat ctc taa cgc cta caa aac gct taa 1488 gag tac att tta tat aca aag ctt tga ttt tta gtt aac tgt aca aaa 1536 att gta aac ttt aca ata tat ctg ttt ttg aag aaa aaa aac aat gca 1584 ata tat ttt ttt gta att cgc tca tat gac tgt gtg tag tct tct tat 1632 tca ttg gca taa tat att tta ttt taa gga acc caa tat cga caa tat 1680 cga ata ctc gag ttt tcg gcc caa aac cgc cgt gtt ccc agc caa aaa 1728 tta tcc ata cga agc gct cag att cat ata agg ata gac aag cca cat 1776 agt ctt tgg att gaa aaa gca aaa agt ttg cca gaa aaa cat tta cta 1824 aac act aaa cga aaa tgg gga gca aat aaa cca cat cac aga ata aaa 1872 ata tgg gta ttt cag tta tca act tca atc aat att acg gag aag gta 1920 aca tta aaa ata ttg ttt ttt aac taa ttt tta aag ctc ttt att tat 1968 agg gaa tag ata agg cta tta tct tcc gag ctt ctt ttc aag ttg acc 2016 cga aaa att tag gat ggc aaa aat tcg att taa ccg ata caa tcc gtg 2064 aat ggt acg gcc ata cca gtc acg aaa aac tgc gtt tgt tga ttg att 2112 gta ctg gat gcg gtg gtc gtt act ctt tgc acc tct ttc aaa ctt cga 2160 aac tga gag gaa act cat ccg act act tga gta cga atc cca atc gac 2208 cat ttc tcg tac ttc ata cag aat cat cca gaa caa ggc gtg tta ggc 2256 gac gcg ccg tag att gtg gcg gag ctt taa atg gac agt gct gta aag 2304 aat ctt ttt atg ttt ctt tca aag cac tag gtt ggg atg act gga tta 2352 ttg cac cca gag gat att ttg caa act att gcc gag gcg att gca cgg 2400 gct ctt tta gga cgc cag aca ctt tcc aaa cat ttc atg ctc att tta 2448 tag agg agt atc gga aaa tgg 
gac tta tga atg gaa tgc gac cgt gtt 2496 gcg ccc caa taa aat tct cta gta tgt cct taa tat att atg gtg atg 2544 atg gta tta tta aac gag act tgc caa aaa tgg ttg ttg atg agt gtg 2592 gtt gcc ctt ag cgtaagtaat aacacaatct tattttacct aattctataa 2643 caaaattact tttacagctt gccttacaat tttatatttt ccgtccgata gaaataaaaa 2703 tatatgtgtc ctaactgagc gccaatctct taacgaaatc ttttactttt aagttaaaag 2763 tacataaata tatacataaa gtagactcaa ttttatttta tacttagcta tgctggtgac 2823 aatatttgta tatttacgaa caaatccaaa ttgaggagtg cctaaattac gttaatgaaa 2883 tatttgtaca ttttaagaat atttcaaaaa tcactaaaat atctttgtac taataaaatt 2943 cgatattttt aattattact ttatttaata tgttcaaatt aacataagta taaaacgaaa 3003 tgatttaata acctataaca tgagcaaagc gtcgcgcttt ttttattgtc aaaacattaa 3063 ttttactaac ttgaaaagct tatatcacag acatgtaata aatatttcat attacagttt 3123 aataaagtat taatataagg attactatat gaaataaaat aaatttatgt tttcttgtta 3183 ataaacaatg taagcaatgt aaacgctgga ctggggtgct cgcataatat gcttatccct 3243 ctgtgtgttt gccagttgtt ctgtatga 3271 80 373 PRT Drosophila melanogaster 80 Gly Glu Met Ser His Asn Asp Tyr Glu Tyr Phe Asn Asp Tyr Ser Val 1 5 10 15 Gln Thr His Asp Lys Asn Arg Tyr His Glu Gly Arg Ser Ser Ile Gly 20 25 30 Tyr Gln Pro Ala Ile His Asn Ile Glu Tyr Glu Asn Gln Lys Gly His 35 40 45 His Glu Ser Phe Ala Asp Asp His Glu Asn Ile Asp His Glu Asp Phe 50 55 60 Phe Gly Asn Thr Gln Glu Ile Ile Thr Phe Ala Glu Glu Gly Thr Gln 65 70 75 80 Tyr Arg Gln Tyr Arg Ile Leu Glu Phe Ser Ala Gln Asn Arg Arg Val 85 90 95 Pro Ser Gln Lys Leu Ser Ile Arg Ser Ala Gln Ile His Ile Arg Ile 100 105 110 Asp Lys Pro His Ser Leu Trp Ile Glu Lys Ala Lys Ser Leu Pro Glu 115 120 125 Lys His Leu Leu Asn Thr Lys Arg Lys Trp Gly Ala Asn Lys Pro His 130 135 140 His Arg Ile Lys Ile Trp Val Phe Gln Leu Ser Thr Ser Ile Asn Ile 145 150 155 160 Thr Glu Lys Gly Ile Asp Lys Ala Ile Ile Phe Arg Ala Ser Phe Gln 165 170 175 Val Asp Pro Lys Asn Leu Gly Trp Gln Lys Phe Asp Leu Thr Asp Thr 180 185 190 Ile Arg Glu Trp Tyr Gly His Thr Ser His Glu Lys Leu Arg Leu Leu 195 
200 205 Ile Asp Cys Thr Gly Cys Gly Gly Arg Tyr Ser Leu His Leu Phe Gln 210 215 220 Thr Ser Lys Leu Arg Gly Asn Ser Ser Asp Tyr Leu Ser Thr Asn Pro 225 230 235 240 Asn Arg Pro Phe Leu Val Leu His Thr Glu Ser Ser Arg Thr Arg Arg 245 250 255 Val Arg Arg Arg Ala Val Asp Cys Gly Gly Ala Leu Asn Gly Gln Cys 260 265 270 Cys Lys Glu Ser Phe Tyr Val Ser Phe Lys Ala Leu Gly Trp Asp Asp 275 280 285 Trp Ile Ile Ala Pro Arg Gly Tyr Phe Ala Asn Tyr Cys Arg Gly Asp 290 295 300 Cys Thr Gly Ser Phe Arg Thr Pro Asp Thr Phe Gln Thr Phe His Ala 305 310 315 320 His Phe Ile Glu Glu Tyr Arg Lys Met Gly Leu Met Asn Gly Met Arg 325 330 335 Pro Cys Cys Ala Pro Ile Lys Phe Ser Ser Met Ser Leu Ile Tyr Tyr 340 345 350 Gly Asp Asp Gly Ile Ile Lys Arg Asp Leu Pro Lys Met Val Val Asp 355 360 365 Glu Cys Gly Cys Pro 370 81 836 PRT Drosophila melanogaster 81 Met Asp Ser Leu Ile Thr Ile Val Asn Lys Leu Gln Asp Ala Phe Thr 1 5 10 15 Ser Leu Gly Val His Met Gln Leu Asp Leu Pro Gln Ile Ala Val Val 20 25 30 Gly Gly Gln Ser Ala Gly Lys Ser Ser Val Leu Glu Asn Phe Val Gly 35 40 45 Lys Asp Phe Leu Pro Arg Gly Ser Gly Ile Val Thr Arg Arg Pro Leu 50 55 60 Ile Leu Gln Leu Ile Asn Gly Val Thr Glu Tyr Gly Glu Phe Leu His 65 70 75 80 Ile Lys Gly Lys Lys Phe Ser Ser Phe Asp Glu Ile Arg Lys Glu Ile 85 90 95 Glu Asp Glu Thr Asp Arg Val Thr Gly Ser Asn Lys Gly Ile Ser Asn 100 105 110 Ile Pro Ile Asn Leu Arg Val Tyr Ser Pro His Val Leu Asn Leu Thr 115 120 125 Leu Ile Asp Leu Pro Gly Leu Thr Lys Val Ala Ile Gly Asp Gln Pro 130 135 140 Val Asp Ile Glu Gln Gln Ile Lys Gln Met Ile Phe Gln Phe Ile Arg 145 150 155 160 Lys Glu Thr Cys Leu Ile Leu Ala Val Thr Pro Ala Asn Thr Asp Leu 165 170 175 Ala Asn Ser Asp Ala Leu Lys Leu Ala Lys Glu Val Asp Pro Gln Gly 180 185 190 Val Arg Thr Ile Gly Val Ile Thr Lys Leu Asp Leu Met Asp Glu Gly 195 200 205 Thr Asp Ala Arg Asp Ile Leu Glu Asn Lys Leu Leu Pro Leu Arg Arg 210 215 220 Gly Tyr Ile Gly Val Val Asn Arg Ser Gln Lys Asp Ile Glu Gly Arg 225 230 235 240 Lys Asp Ile His Gln Ala Leu 
Ala Ala Glu Arg Lys Phe Phe Leu Ser 245 250 255 His Pro Ser Tyr Arg His Met Ala Asp Arg Leu Gly Thr Pro Tyr Leu 260 265 270 Gln Arg Val Leu Asn Gln Gln Leu Thr Asn His Ile Arg Asp Thr Leu 275 280 285 Pro Gly Leu Arg Asp Lys Leu Gln Lys Gln Met Leu Thr Leu Glu Lys 290 295 300 Glu Val Glu Glu Phe Lys His Phe Gln Pro Gly Asp Ala Ser Ile Lys 305 310 315 320 Thr Lys Ala Met Leu Gln Met Ile Gln Gln Leu Gln Ser Asp Phe Glu 325 330 335 Arg Thr Ile Glu Gly Ser Gly Ser Ala Leu Val Asn Thr Asn Glu Leu 340 345 350 Ser Gly Gly Ala Lys Ile Asn Arg Ile Phe His Glu Arg Leu Arg Phe 355 360 365 Glu Ile Val Lys Met Ala Cys Asp Glu Lys Glu Leu Arg Arg Glu Ile 370 375 380 Ser Phe Ala Ile Arg Asn Ile His Gly Ile Arg Val Gly Leu Phe Thr 385 390 395 400 Pro Asp Met Ala Phe Glu Ala Ile Val Lys Arg Gln Ile Ala Leu Leu 405 410 415 Lys Glu Pro Val Ile Lys Cys Val Asp Leu Val Val Gln Glu Leu Ser 420 425 430 Val Val Val Arg Met Cys Thr Ala Lys Met Ser Arg Tyr Pro Arg Leu 435 440 445 Arg Glu Glu Thr Glu Arg Ile Ile Thr Thr His Val Arg Gln Arg Glu 450 455 460 His Ser Cys Lys Glu Gln Ile Leu Leu Leu Ile Asp Phe Glu Leu Ala 465 470 475 480 Tyr Met Asn Thr Asn His Glu Asp Phe Ile Gly Phe Ala Asn Ala Gln 485 490 495 Asn Lys Ser Glu Asn Ala Asn Lys Thr Gly Thr Arg Gln Leu Gly Asn 500 505 510 Gln Val Ile Arg Lys Gly His Met Val Ile Gln Asn Leu Gly Ile Met 515 520 525 Lys Gly Gly Ser Arg Pro Tyr Trp Phe Val Leu Thr Ser Glu Ser Ile 530 535 540 Ser Trp Tyr Lys Asp Glu Asp Glu Lys Glu Lys Lys Phe Met Leu Pro 545 550 555 560 Leu Asp Gly Leu Lys Leu Arg Asp Ile Glu Gln Gly Phe Met Ser Met 565 570 575 Ser Arg Arg Val Thr Phe Ala Leu Phe Ser Pro Asp Gly Arg Asn Val 580 585 590 Tyr Lys Asp Tyr Lys Gln Leu Glu Leu Ser Cys Glu Thr Val Glu Asp 595 600 605 Val Glu Ser Trp Lys Ala Ser Phe Leu Arg Ala Gly Val Tyr Pro Glu 610 615 620 Lys Gln Glu Thr Gln Glu Asn Gly Asp Glu Glu Gly Gln Glu Gln Lys 625 630 635 640 Ser Ala Ser Glu Glu Ser Ser Ser Asp Pro Gln Leu Glu Arg Gln Val 645 650 655 Glu Thr Ile Arg Asn Leu Val Asp 
Ser Tyr Met Lys Ile Val Thr Lys 660 665 670 Thr Thr Arg Asp Met Val Pro Lys Ala Ile Met Met Leu Ile Ile Asn 675 680 685 Asn Ala Lys Asp Phe Ile Asn Gly Glu Leu Leu Ala His Leu Tyr Ala 690 695 700 Ser Gly Asp Gln Ala Gln Met Met Glu Glu Ser Ala Glu Ser Ala Thr 705 710 715 720 Arg Arg Glu Glu Met Leu Arg Met Tyr Arg Ala Cys Lys Asp Ala Leu 725 730 735 Gln Ile Ile Gly Asp Val Ser Met Ala Thr Val Ser Ser Pro Leu Pro 740 745 750 Pro Pro Val Lys Asn Asp Trp Leu Pro Ser Gly Leu Asp Asn Pro Arg 755 760 765 Leu Ser Pro Pro Ser Pro Gly Gly Val Arg Gly Lys Pro Gly Pro Pro 770 775 780 Ala Gln Ser Ser Leu Gly Gly Arg Asn Pro Pro Leu Pro Pro Ser Thr 785 790 795 800 Gly Arg Pro Ala Pro Ala Ile Pro Asn Arg Pro Gly Gly Gly Ala Pro 805 810 815 Pro Leu Pro Gly Gly Arg Pro Gly Gly Ser Leu Pro Pro Pro Met Leu 820 825 830 Pro Ser Arg Arg 835 82 82 000 83 477 PRT Drosophila melanogaster 83 Met Ser Asp Asp Phe Lys Arg Ser Ile Arg Gly Lys Ser Ala Ser Ala 1 5 10 15 Ile Ala Gln Ala Leu Leu Ser Glu Ser Glu Lys Asn Ile Lys Thr Ala 20 25 30 Lys Lys Glu Glu Glu Asp Tyr Ile Ala Gln Thr Leu Val Arg Ser Ser 35 40 45 Arg Ala Val Ser Arg Ala Arg Ser Arg Ser Ser Ser Pro Leu Asp Gly 50 55 60 Gln Tyr Arg Ala His Ala Leu His Ile Glu Leu Met Asp Asp Arg Leu 65 70 75 80 Val Asp Lys Leu Asp His Arg Val Ser Ser Ser Leu His Asn Val Lys 85 90 95 Arg Gln Leu Ser Thr Leu Asn Gln Arg Thr Val Glu Phe Tyr Ala Asp 100 105 110 Ser Ser Lys Gln Thr Ser Ile Glu Ile Glu Gln Leu Asn Ala Arg Val 115 120 125 Ile Glu Ala Glu Thr Arg Leu Lys Thr Glu Val Thr Arg Ile Lys Lys 130 135 140 Lys Leu Gln Ile Gln Ile Thr Glu Leu Glu Met Ser Leu Asp Val Ala 145 150 155 160 Asn Lys Thr Asn Ile Asp Leu Gln Lys Val Ile Lys Lys Gln Ser Leu 165 170 175 Gln Leu Thr Glu Leu Gln Ala His Tyr Glu Asp Val Gln Arg Gln Leu 180 185 190 Gln Ala Thr Leu Asp Gln Tyr Ala Val Ala Gln Arg Arg Leu Ala Gly 195 200 205 Leu Asn Gly Glu Leu Glu Glu Val Arg Ser His Leu Asp Ser Ala Asn 210 215 220 Arg Ala Lys Arg Thr Val Glu Leu Gln Tyr Glu Glu Ala Ala Ser Arg 
225 230 235 240 Ile Asn Glu Leu Thr Thr Ala Asn Val Ser Leu Val Ser Ile Lys Ser 245 250 255 Lys Leu Glu Gln Glu Leu Ser Val Val Ala Ser Asp Tyr Glu Glu Val 260 265 270 Ser Lys Glu Leu Arg Ile Ser Asp Glu Arg Tyr Gln Lys Val Gln Val 275 280 285 Glu Leu Lys His Val Val Glu Gln Val His Glu Glu Gln Glu Arg Ile 290 295 300 Val Lys Leu Glu Thr Ile Lys Lys Ser Leu Glu Val Glu Val Lys Asn 305 310 315 320 Leu Ser Ile Arg Leu Glu Glu Val Glu Leu Asn Ala Val Ala Gly Ser 325 330 335 Lys Arg Ile Ile Ser Lys Leu Glu Ala Arg Ile Arg Asp Leu Glu Leu 340 345 350 Glu Leu Glu Glu Glu Lys Arg Arg His Ala Glu Thr Ile Lys Ile Leu 355 360 365 Arg Lys Lys Glu Arg Thr Val Lys Glu Val Leu Val Gln Cys Glu Glu 370 375 380 Asp Gln Lys Asn Leu Ile Leu Leu Gln Asp Ala Leu Asp Lys Ser Thr 385 390 395 400 Ala Lys Ile Asn Ile Tyr Arg Arg Gln Leu Ser Glu Gln Glu Gly Val 405 410 415 Ser Gln Gln Thr Thr Thr Arg Val Arg Arg Phe Gln Arg Glu Leu Glu 420 425 430 Ala Ala Glu Asp Arg Ala Asp Thr Ala Glu Ser Ser Leu Asn Ile Ile 435 440 445 Arg Ala Lys His Arg Thr Phe Val Thr Thr Ser Thr Val Pro Gly Ser 450 455 460 Gln Val Tyr Ile Gln Glu Thr Thr Arg Thr Ile Thr Glu 465 470 475 84 84 000 85 2221 DNA Drosophila melanogaster CDS (71)...(1729) 85 ggcacgagga ggagttggct taggaattgt acaattactt ctgcggcgga acagttaaaa 60 agcattaaaa atg tcg att ttt tcc gcc cgc ctg gcg tcc tcg gtc gcc 109 Met Ser Ile Phe Ser Ala Arg Leu Ala Ser Ser Val Ala 1 5 10 cgt aac ctg ccc aag gct gcc aac cag gtc gcc tgc aaa gcc gct tat 157 Arg Asn Leu Pro Lys Ala Ala Asn Gln Val Ala Cys Lys Ala Ala Tyr 15 20 25 ccc gcc gcc agt ctt gct gcc cgc aag ctc cat gtg gcc agc acg cag 205 Pro Ala Ala Ser Leu Ala Ala Arg Lys Leu His Val Ala Ser Thr Gln 30 35 40 45 cgt agc gcc gag atc tcc aac atc ctg gag gag cgc atc ctg ggc gtc 253 Arg Ser Ala Glu Ile Ser Asn Ile Leu Glu Glu Arg Ile Leu Gly Val 50 55 60 gcc ccc aag gct gat ctg gag gag acc ggc cgt gtg ctg agc atc ggt 301 Ala Pro Lys Ala Asp Leu Glu Glu Thr Gly Arg Val Leu Ser Ile Gly 65 70 75 gac ggt atc gcc 
cgt gtc tac ggt ctg aac aac atc cag gcc gat gag 349 Asp Gly Ile Ala Arg Val Tyr Gly Leu Asn Asn Ile Gln Ala Asp Glu 80 85 90 atg gtg gag ttc tcc tcc gga ctg aag ggc atg gcc ctt aac ttg gag 397 Met Val Glu Phe Ser Ser Gly Leu Lys Gly Met Ala Leu Asn Leu Glu 95 100 105 ccc gac aac gtc ggt gtt gtg gtc ttc ggt aac gat aag ctg atc aag 445 Pro Asp Asn Val Gly Val Val Val Phe Gly Asn Asp Lys Leu Ile Lys 110 115 120 125 cag ggc gat atc gtc aag cgt acc ggt gcc atc gtg gat gtg ccc gtc 493 Gln Gly Asp Ile Val Lys Arg Thr Gly Ala Ile Val Asp Val Pro Val 130 135 140 ggt gat gag ctg ctg ggt cgc gtc gtc gat gcc ctg gga aat gcc atc 541 Gly Asp Glu Leu Leu Gly Arg Val Val Asp Ala Leu Gly Asn Ala Ile 145 150 155 gac ggc aag ggt gcc atc aac acc aag gac cgt ttc cgt gtg gga atc 589 Asp Gly Lys Gly Ala Ile Asn Thr Lys Asp Arg Phe Arg Val Gly Ile 160 165 170 aag gcc ccc ggc atc atc ccc cgt gtg tcc gtg cgc gag ccc atg cag 637 Lys Ala Pro Gly Ile Ile Pro Arg Val Ser Val Arg Glu Pro Met Gln 175 180 185 acc ggt atc aag gcc gtc gac tcc ctg gtg ccc atc ggt cgt ggt cag 685 Thr Gly Ile Lys Ala Val Asp Ser Leu Val Pro Ile Gly Arg Gly Gln 190 195 200 205 cgt gag ctg atc att ggc gat cgt cag act ggt aag acc gct ctg gcc 733 Arg Glu Leu Ile Ile Gly Asp Arg Gln Thr Gly Lys Thr Ala Leu Ala 210 215 220 atc gac acc atc atc aac cag aag cgc ttc aac gag gcc cag gac gag 781 Ile Asp Thr Ile Ile Asn Gln Lys Arg Phe Asn Glu Ala Gln Asp Glu 225 230 235 tcc aag aag ctg tac tgc atc tac gtc gcc att ggc cag aag cgt tcc 829 Ser Lys Lys Leu Tyr Cys Ile Tyr Val Ala Ile Gly Gln Lys Arg Ser 240 245 250 acc gtc gct cag atc gtg aag cgt ctg acc gac tcc ggc gcc atg ggc 877 Thr Val Ala Gln Ile Val Lys Arg Leu Thr Asp Ser Gly Ala Met Gly 255 260 265 tac tcc gtg atc gtg tcg gcc acc gcc tcc gac gct gct ccc ctg cag 925 Tyr Ser Val Ile Val Ser Ala Thr Ala Ser Asp Ala Ala Pro Leu Gln 270 275 280 285 tac ttg gcc ccc tac tcc gga tgc gcc atg gga gag tac ttc cgc gac 973 Tyr Leu Ala Pro Tyr Ser Gly Cys Ala Met Gly Glu Tyr Phe Arg Asp 290 
295 300 aag ggc aag cac gcc ctg atc atc tac gat gat ttg tcc aag cag gct 1021 Lys Gly Lys His Ala Leu Ile Ile Tyr Asp Asp Leu Ser Lys Gln Ala 305 310 315 gtg gcc tac cgt cag atg tcc ctg ttg ctg cgt cgt ccc cca ggt cgt 1069 Val Ala Tyr Arg Gln Met Ser Leu Leu Leu Arg Arg Pro Pro Gly Arg 320 325 330 gag gcc tac ccc ggt gat gtg ttc tac ctg cat tcg cgt ctg ctg gag 1117 Glu Ala Tyr Pro Gly Asp Val Phe Tyr Leu His Ser Arg Leu Leu Glu 335 340 345 cgt gcc gcc aag atg tcc ccc gcc atg gga ggc ggt tcc ctg act gct 1165 Arg Ala Ala Lys Met Ser Pro Ala Met Gly Gly Gly Ser Leu Thr Ala 350 355 360 365 ctg ccc gtg atc gag acc cag gct ggc gat gtg tcc gcc tac att cca 1213 Leu Pro Val Ile Glu Thr Gln Ala Gly Asp Val Ser Ala Tyr Ile Pro 370 375 380 acc aac gtc atc tcg att acc gat gga cag atc ttc ttg gag acc gag 1261 Thr Asn Val Ile Ser Ile Thr Asp Gly Gln Ile Phe Leu Glu Thr Glu 385 390 395 ttg ttc tac aag ggt atc cgc cct gcc att aac gtc ggt ctg tcc gtg 1309 Leu Phe Tyr Lys Gly Ile Arg Pro Ala Ile Asn Val Gly Leu Ser Val 400 405 410 tcc cgt gtg ggt tcc gct gcc cag acc aag gcc atg aag cag gtg gcc 1357 Ser Arg Val Gly Ser Ala Ala Gln Thr Lys Ala Met Lys Gln Val Ala 415 420 425 ggt tcc atg aag ctg gag ttg gcc cag tac cgt gag gtc gct gcc ttc 1405 Gly Ser Met Lys Leu Glu Leu Ala Gln Tyr Arg Glu Val Ala Ala Phe 430 435 440 445 gcc cag ttc ggt tcc gat ctg gat gcc gcc acc cag cag ctg ctg aac 1453 Ala Gln Phe Gly Ser Asp Leu Asp Ala Ala Thr Gln Gln Leu Leu Asn 450 455 460 cgt ggt gtg cgc ctc act gag ctg ctc aag cag ggt cag tac gtg ccc 1501 Arg Gly Val Arg Leu Thr Glu Leu Leu Lys Gln Gly Gln Tyr Val Pro 465 470 475 atg gcc att gag gat cag gtc gcc gtc atc tac tgc ggt gtg cgc ggt 1549 Met Ala Ile Glu Asp Gln Val Ala Val Ile Tyr Cys Gly Val Arg Gly 480 485 490 cat ctg gac aag atg gat ccc gcc aag atc acc aag ttc gag aag gag 1597 His Leu Asp Lys Met Asp Pro Ala Lys Ile Thr Lys Phe Glu Lys Glu 495 500 505 ttc ttg cag cac atc aag acc tcc gag cag gct ctg ctc gac acc atc 1645 Phe Leu Gln His Ile Lys Thr 
Ser Glu Gln Ala Leu Leu Asp Thr Ile 510 515 520 525 gcc aag gac ggt gct atc tcg gag gcg tcc gat gcc aag ctg aag gac 1693 Ala Lys Asp Gly Ala Ile Ser Glu Ala Ser Asp Ala Lys Leu Lys Asp 530 535 540 att gtt gcc aag ttc atg tcc acc ttc cag ggt taa gcatcagcag 1739 Ile Val Ala Lys Phe Met Ser Thr Phe Gln Gly * 545 550 caaacgaaga ataggagtag cggtgcgtgc aattaacatc atagatggct cgaataatgc 1799 tgccatccct gcgcgccatc aacaacaaca aataacaacc tacaacttaa gataaacaga 1859 aattataaaa caaaaacaag agaaactcct ttgcacattc atgttttcct gcgcatcctg 1919 cctggatcga tcgccttcaa tccgtggtcc ggtgccatta tgccgcaaaa tcttgggaga 1979 gagagagccg gccctggcta cggagtggac ggcgagcccc agccggatgc actttcgagt 2039 ttcagcgaaa aatcttgcgg ttcctccaac ctccaaaacg atagataatc ccccgttgtt 2099 cttctaccta aaatatgtaa tcaatatttg aaatattgta atcgagaata gtcgaaataa 2159 tgtctgagga tgattgtgaa acgatgatga attaaaacga tactaaatta tactctaaaa 2219 aa 2221 86 552 PRT Drosophila melanogaster 86 Met Ser Ile Phe Ser Ala Arg Leu Ala Ser Ser Val Ala Arg Asn Leu 1 5 10 15 Pro Lys Ala Ala Asn Gln Val Ala Cys Lys Ala Ala Tyr Pro Ala Ala 20 25 30 Ser Leu Ala Ala Arg Lys Leu His Val Ala Ser Thr Gln Arg Ser Ala 35 40 45 Glu Ile Ser Asn Ile Leu Glu Glu Arg Ile Leu Gly Val Ala Pro Lys 50 55 60 Ala Asp Leu Glu Glu Thr Gly Arg Val Leu Ser Ile Gly Asp Gly Ile 65 70 75 80 Ala Arg Val Tyr Gly Leu Asn Asn Ile Gln Ala Asp Glu Met Val Glu 85 90 95 Phe Ser Ser Gly Leu Lys Gly Met Ala Leu Asn Leu Glu Pro Asp Asn 100 105 110 Val Gly Val Val Val Phe Gly Asn Asp Lys Leu Ile Lys Gln Gly Asp 115 120 125 Ile Val Lys Arg Thr Gly Ala Ile Val Asp Val Pro Val Gly Asp Glu 130 135 140 Leu Leu Gly Arg Val Val Asp Ala Leu Gly Asn Ala Ile Asp Gly Lys 145 150 155 160 Gly Ala Ile Asn Thr Lys Asp Arg Phe Arg Val Gly Ile Lys Ala Pro 165 170 175 Gly Ile Ile Pro Arg Val Ser Val Arg Glu Pro Met Gln Thr Gly Ile 180 185 190 Lys Ala Val Asp Ser Leu Val Pro Ile Gly Arg Gly Gln Arg Glu Leu 195 200 205 Ile Ile Gly Asp Arg Gln Thr Gly Lys Thr Ala Leu Ala Ile Asp Thr 210 215 220 Ile Ile Asn Gln 
Lys Arg Phe Asn Glu Ala Gln Asp Glu Ser Lys Lys 225 230 235 240 Leu Tyr Cys Ile Tyr Val Ala Ile Gly Gln Lys Arg Ser Thr Val Ala 245 250 255 Gln Ile Val Lys Arg Leu Thr Asp Ser Gly Ala Met Gly Tyr Ser Val 260 265 270 Ile Val Ser Ala Thr Ala Ser Asp Ala Ala Pro Leu Gln Tyr Leu Ala 275 280 285 Pro Tyr Ser Gly Cys Ala Met Gly Glu Tyr Phe Arg Asp Lys Gly Lys 290 295 300 His Ala Leu Ile Ile Tyr Asp Asp Leu Ser Lys Gln Ala Val Ala Tyr 305 310 315 320 Arg Gln Met Ser Leu Leu Leu Arg Arg Pro Pro Gly Arg Glu Ala Tyr 325 330 335 Pro Gly Asp Val Phe Tyr Leu His Ser Arg Leu Leu Glu Arg Ala Ala 340 345 350 Lys Met Ser Pro Ala Met Gly Gly Gly Ser Leu Thr Ala Leu Pro Val 355 360 365 Ile Glu Thr Gln Ala Gly Asp Val Ser Ala Tyr Ile Pro Thr Asn Val 370 375 380 Ile Ser Ile Thr Asp Gly Gln Ile Phe Leu Glu Thr Glu Leu Phe Tyr 385 390 395 400 Lys Gly Ile Arg Pro Ala Ile Asn Val Gly Leu Ser Val Ser Arg Val 405 410 415 Gly Ser Ala Ala Gln Thr Lys Ala Met Lys Gln Val Ala Gly Ser Met 420 425 430 Lys Leu Glu Leu Ala Gln Tyr Arg Glu Val Ala Ala Phe Ala Gln Phe 435 440 445 Gly Ser Asp Leu Asp Ala Ala Thr Gln Gln Leu Leu Asn Arg Gly Val 450 455 460 Arg Leu Thr Glu Leu Leu Lys Gln Gly Gln Tyr Val Pro Met Ala Ile 465 470 475 480 Glu Asp Gln Val Ala Val Ile Tyr Cys Gly Val Arg Gly His Leu Asp 485 490 495 Lys Met Asp Pro Ala Lys Ile Thr Lys Phe Glu Lys Glu Phe Leu Gln 500 505 510 His Ile Lys Thr Ser Glu Gln Ala Leu Leu Asp Thr Ile Ala Lys Asp 515 520 525 Gly Ala Ile Ser Glu Ala Ser Asp Ala Lys Leu Lys Asp Ile Val Ala 530 535 540 Lys Phe Met Ser Thr Phe Gln Gly 545 550 87 2012 DNA Drosophila melanogaster CDS (346)...(1740) 87 gggacgagcg gatcattggc agaaaggcaa gaagaagagc aagcaatcac ctgcgaaaaa 60 gaaataaatt gccgagaaat gttaaagacg ttgtagtgcg gcaatttggc gtgcgttgaa 120 atgactcctg gcgaccagta gaagcattta gtgaccacgc tactcatccc ccgtcccccg 180 ccgcagcggc tgagccactg gacaaaggcc agcttacaac aaaaacaaca ccagggagac 240 agtcacgggc cattggcacc cacacaagca cacacacgca cgcactgcgt aaaagcaaac 300 aacaacgata acaactggga acggaggcct 
tcaactgcag gaaag atg gaa gct gat 357 Met Glu Ala Asp 1 gga ctg acc aac gaa cag acg gag aag gtc ctc cag ttc cag gat ctc 405 Gly Leu Thr Asn Glu Gln Thr Glu Lys Val Leu Gln Phe Gln Asp Leu 5 10 15 20 acc ggc atc gag gat atg aac gtc tgc cgc gac gtc cta atc aga cac 453 Thr Gly Ile Glu Asp Met Asn Val Cys Arg Asp Val Leu Ile Arg His 25 30 35 caa tgg gat ctc gag gtg gcc ttc cag gag cag cta aat atc cgc gag 501 Gln Trp Asp Leu Glu Val Ala Phe Gln Glu Gln Leu Asn Ile Arg Glu 40 45 50 ggg cgt ccg acc atg ttc gcc gcc tct aca gat gtc cga gcg ccc gct 549 Gly Arg Pro Thr Met Phe Ala Ala Ser Thr Asp Val Arg Ala Pro Ala 55 60 65 gtc ctt aac gat cgc ttc ctg cag cag gtg ttt tcc gcc aac atg cct 597 Val Leu Asn Asp Arg Phe Leu Gln Gln Val Phe Ser Ala Asn Met Pro 70 75 80 ggc gga agg acg gtt agc cgg gtg ccc agt ggc ccc gtt ccc cgc agc 645 Gly Gly Arg Thr Val Ser Arg Val Pro Ser Gly Pro Val Pro Arg Ser 85 90 95 100 ttc acc ggc atc att gga tat gtg att aac ttc gtg ttt cag tac ttc 693 Phe Thr Gly Ile Ile Gly Tyr Val Ile Asn Phe Val Phe Gln Tyr Phe 105 110 115 tac tcc acg ctg acg agc atc gtc agt gcg ttt gtg aac ctg ggc gga 741 Tyr Ser Thr Leu Thr Ser Ile Val Ser Ala Phe Val Asn Leu Gly Gly 120 125 130 gga aac gaa gcc cgc ttg gtg aca gat cca ctg ggg gat gtg atg aag 789 Gly Asn Glu Ala Arg Leu Val Thr Asp Pro Leu Gly Asp Val Met Lys 135 140 145 ttt att agg gaa tac tac gaa agg tat ccc gag cac ccg gtc ttc tat 837 Phe Ile Arg Glu Tyr Tyr Glu Arg Tyr Pro Glu His Pro Val Phe Tyr 150 155 160 cag ggc aca tat gcc cag gcc ctt aac gat gct aag cag gag ctg cgt 885 Gln Gly Thr Tyr Ala Gln Ala Leu Asn Asp Ala Lys Gln Glu Leu Arg 165 170 175 180 ttt cta atc gtt tac ctg cac aag gat ccc gcc aag aac cca gat gtg 933 Phe Leu Ile Val Tyr Leu His Lys Asp Pro Ala Lys Asn Pro Asp Val 185 190 195 gaa tcc ttc tgc cgc aac aca ctg tcg gcg aga tcg gtc att gac tac 981 Glu Ser Phe Cys Arg Asn Thr Leu Ser Ala Arg Ser Val Ile Asp Tyr 200 205 210 att aac aca cac acc ctg tta tgg gga tgc gac gta gcc acg ccg gag 1029 Ile Asn Thr 
His Thr Leu Leu Trp Gly Cys Asp Val Ala Thr Pro Glu 215 220 225 ggc tac cgg gtg atg cag tcg ata acg gtg cgc agc tac cca acg atg 1077 Gly Tyr Arg Val Met Gln Ser Ile Thr Val Arg Ser Tyr Pro Thr Met 230 235 240 gtg atg atc agc ctt cga gca aat cgc atg atg atc gtc gga cgt ttc 1125 Val Met Ile Ser Leu Arg Ala Asn Arg Met Met Ile Val Gly Arg Phe 245 250 255 260 gag ggg gat tgc acg ccc gaa gag ctg ctc cgt cgc ctc cag tcg gtg 1173 Glu Gly Asp Cys Thr Pro Glu Glu Leu Leu Arg Arg Leu Gln Ser Val 265 270 275 acg aac gcc aac gag gtg tgg ctg agt caa gcg cgt gca gat cgc tta 1221 Thr Asn Ala Asn Glu Val Trp Leu Ser Gln Ala Arg Ala Asp Arg Leu 280 285 290 gag cgg aat ttt aca cag act tta aga aga cag cag gac gag gca tac 1269 Glu Arg Asn Phe Thr Gln Thr Leu Arg Arg Gln Gln Asp Glu Ala Tyr 295 300 305 gag cag agt ttg ctt gcg gac gag gag aag gag cgt cag cgg caa agg 1317 Glu Gln Ser Leu Leu Ala Asp Glu Glu Lys Glu Arg Gln Arg Gln Arg 310 315 320 gag cgg gat gcc gtc cgg cag gcg gag gag gct gtg gaa caa gct cga 1365 Glu Arg Asp Ala Val Arg Gln Ala Glu Glu Ala Val Glu Gln Ala Arg 325 330 335 340 cgc gat gtc gag ctt cgc aag gag gag att gcc cgg caa aag atc gag 1413 Arg Asp Val Glu Leu Arg Lys Glu Glu Ile Ala Arg Gln Lys Ile Glu 345 350 355 ctg gcc aca ttg gtg cca tcc gag ccg gca gcg gat gct gtc ggc gcc 1461 Leu Ala Thr Leu Val Pro Ser Glu Pro Ala Ala Asp Ala Val Gly Ala 360 365 370 att gca gtt gta ttc aag ctg ccc agt gga aca cgc cta gag cgc cga 1509 Ile Ala Val Val Phe Lys Leu Pro Ser Gly Thr Arg Leu Glu Arg Arg 375 380 385 ttt aac cag acg gat tcg gtg ctt gac gtt tat cac tat ctc ttc tgc 1557 Phe Asn Gln Thr Asp Ser Val Leu Asp Val Tyr His Tyr Leu Phe Cys 390 395 400 cat ccc gat tcg ccg gat gag ttc gag atc aca acg aat ttc ccg aag 1605 His Pro Asp Ser Pro Asp Glu Phe Glu Ile Thr Thr Asn Phe Pro Lys 405 410 415 420 cgc gtg ctc ttc tcg aaa gcg aac ttg gat gcc gct ggg gaa aca ggc 1653 Arg Val Leu Phe Ser Lys Ala Asn Leu Asp Ala Ala Gly Glu Thr Gly 425 430 435 aca gcc aag gag acg ctg acc aaa acc 
ctt cag gcc gta gga ctc aag 1701 Thr Ala Lys Glu Thr Leu Thr Lys Thr Leu Gln Ala Val Gly Leu Lys 440 445 450 aac cgc gag ttg ctg ttc gtt aac gat ctg gaa gcg taa ccaagcacca 1750 Asn Arg Glu Leu Leu Phe Val Asn Asp Leu Glu Ala * 455 460 cctaccaaat gaaagcatca tcgtgttacc agcagcaacg cattaccaaa acttgaacca 1810 agttttaagt agtcgtagtc aacgtccaag tcggaggata aatagagaaa ctattcctaa 1870 tttaatcatg aacattaata ttaataattg tttaacaatg tttaggttga ttctctgatc 1930 tgcccaacac catcaaccaa cccgtttctc aaacattcac acatcagtaa aaatcgtttc 1990 cagaataatt tccgtgaacc ac 2012 88 464 PRT Drosophila melanogaster 88 Met Glu Ala Asp Gly Leu Thr Asn Glu Gln Thr Glu Lys Val Leu Gln 1 5 10 15 Phe Gln Asp Leu Thr Gly Ile Glu Asp Met Asn Val Cys Arg Asp Val 20 25 30 Leu Ile Arg His Gln Trp Asp Leu Glu Val Ala Phe Gln Glu Gln Leu 35 40 45 Asn Ile Arg Glu Gly Arg Pro Thr Met Phe Ala Ala Ser Thr Asp Val 50 55 60 Arg Ala Pro Ala Val Leu Asn Asp Arg Phe Leu Gln Gln Val Phe Ser 65 70 75 80 Ala Asn Met Pro Gly Gly Arg Thr Val Ser Arg Val Pro Ser Gly Pro 85 90 95 Val Pro Arg Ser Phe Thr Gly Ile Ile Gly Tyr Val Ile Asn Phe Val 100 105 110 Phe Gln Tyr Phe Tyr Ser Thr Leu Thr Ser Ile Val Ser Ala Phe Val 115 120 125 Asn Leu Gly Gly Gly Asn Glu Ala Arg Leu Val Thr Asp Pro Leu Gly 130 135 140 Asp Val Met Lys Phe Ile Arg Glu Tyr Tyr Glu Arg Tyr Pro Glu His 145 150 155 160 Pro Val Phe Tyr Gln Gly Thr Tyr Ala Gln Ala Leu Asn Asp Ala Lys 165 170 175 Gln Glu Leu Arg Phe Leu Ile Val Tyr Leu His Lys Asp Pro Ala Lys 180 185 190 Asn Pro Asp Val Glu Ser Phe Cys Arg Asn Thr Leu Ser Ala Arg Ser 195 200 205 Val Ile Asp Tyr Ile Asn Thr His Thr Leu Leu Trp Gly Cys Asp Val 210 215 220 Ala Thr Pro Glu Gly Tyr Arg Val Met Gln Ser Ile Thr Val Arg Ser 225 230 235 240 Tyr Pro Thr Met Val Met Ile Ser Leu Arg Ala Asn Arg Met Met Ile 245 250 255 Val Gly Arg Phe Glu Gly Asp Cys Thr Pro Glu Glu Leu Leu Arg Arg 260 265 270 Leu Gln Ser Val Thr Asn Ala Asn Glu Val Trp Leu Ser Gln Ala Arg 275 280 285 Ala Asp Arg Leu Glu Arg Asn Phe Thr Gln Thr Leu 
Arg Arg Gln Gln 290 295 300 Asp Glu Ala Tyr Glu Gln Ser Leu Leu Ala Asp Glu Glu Lys Glu Arg 305 310 315 320 Gln Arg Gln Arg Glu Arg Asp Ala Val Arg Gln Ala Glu Glu Ala Val 325 330 335 Glu Gln Ala Arg Arg Asp Val Glu Leu Arg Lys Glu Glu Ile Ala Arg 340 345 350 Gln Lys Ile Glu Leu Ala Thr Leu Val Pro Ser Glu Pro Ala Ala Asp 355 360 365 Ala Val Gly Ala Ile Ala Val Val Phe Lys Leu Pro Ser Gly Thr Arg 370 375 380 Leu Glu Arg Arg Phe Asn Gln Thr Asp Ser Val Leu Asp Val Tyr His 385 390 395 400 Tyr Leu Phe Cys His Pro Asp Ser Pro Asp Glu Phe Glu Ile Thr Thr 405 410 415 Asn Phe Pro Lys Arg Val Leu Phe Ser Lys Ala Asn Leu Asp Ala Ala 420 425 430 Gly Glu Thr Gly Thr Ala Lys Glu Thr Leu Thr Lys Thr Leu Gln Ala 435 440 445 Val Gly Leu Lys Asn Arg Glu Leu Leu Phe Val Asn Asp Leu Glu Ala 450 455 460 89 2910 DNA Drosophila melanogaster CDS (131)...(1999) 89 ccccccagtc tctgaaacca tcgcaacaaa ttcgattaca cgttccagcg tcggccgcgt 60 tttcctgctg caagatactt cattttcaaa tttgtgcatt ttccgttcgt tataaaaagt 120 cagcgtgaac atg tcg agc aaa tcc cga cgt gct ggc acc gcc acg ccg 169 Met Ser Ser Lys Ser Arg Arg Ala Gly Thr Ala Thr Pro 1 5 10 cag ccc ggc aac acc tcc acc ccc cgg ccg cca tcg gcg ggt ccg cag 217 Gln Pro Gly Asn Thr Ser Thr Pro Arg Pro Pro Ser Ala Gly Pro Gln 15 20 25 ccg ccg ccg ccg tcc act cac tcg cag acg gcc tcc agc ccc ctc agc 265 Pro Pro Pro Pro Ser Thr His Ser Gln Thr Ala Ser Ser Pro Leu Ser 30 35 40 45 ccc acc cgg cac tcg cgc gtg gcc gag aag gtg gag ctg cag aac ctg 313 Pro Thr Arg His Ser Arg Val Ala Glu Lys Val Glu Leu Gln Asn Leu 50 55 60 aac gat cgc ctg gcc acc tac att gac cgg gtg cgc aac ctg gag acg 361 Asn Asp Arg Leu Ala Thr Tyr Ile Asp Arg Val Arg Asn Leu Glu Thr 65 70 75 gag aac tcc cgc ctc acc atc gag gtg cag acc acc agg gac acg gtc 409 Glu Asn Ser Arg Leu Thr Ile Glu Val Gln Thr Thr Arg Asp Thr Val 80 85 90 aca cgc gag acc acc aac atc aag aac atc ttc gag gcc gag ctg ctg 457 Thr Arg Glu Thr Thr Asn Ile Lys Asn Ile Phe Glu Ala Glu Leu Leu 95 100 105 gag acg cgc cgt ctg ctc gat
gac aca gct agg gat cgc gct cgt gcc 505 Glu Thr Arg Arg Leu Leu Asp Asp Thr Ala Arg Asp Arg Ala Arg Ala 110 115 120 125 gag atc gat atc aag cgt ctc tgg gag agg aac gag gag ctc aag aac 553 Glu Ile Asp Ile Lys Arg Leu Trp Glu Arg Asn Glu Glu Leu Lys Asn 130 135 140 aag ctg gac aag aag acc aag gag tgc acc act gct gag ggc aat gtc 601 Lys Leu Asp Lys Lys Thr Lys Glu Cys Thr Thr Ala Glu Gly Asn Val 145 150 155 cgc atg tac gag tcg cgc gcc aac gag ctg aac aac aaa tac aac cag 649 Arg Met Tyr Glu Ser Arg Ala Asn Glu Leu Asn Asn Lys Tyr Asn Gln 160 165 170 gcc aac gcc gat cgg aag aag ctt aac gaa gac ctg aat gag gcg cta 697 Ala Asn Ala Asp Arg Lys Lys Leu Asn Glu Asp Leu Asn Glu Ala Leu 175 180 185 aag gag ctg gag aga ctg cgt aag cag ttc gag gaa acg cgg aag aac 745 Lys Glu Leu Glu Arg Leu Arg Lys Gln Phe Glu Glu Thr Arg Lys Asn 190 195 200 205 ctg gaa cag gag aca ctg tcg cgc gtt gac ctg gag aac acc att cag 793 Leu Glu Gln Glu Thr Leu Ser Arg Val Asp Leu Glu Asn Thr Ile Gln 210 215 220 agt ctg cgc gag gag ctc tcg ttc aag gat cag atc cat tcg cag gag 841 Ser Leu Arg Glu Glu Leu Ser Phe Lys Asp Gln Ile His Ser Gln Glu 225 230 235 atc aat gag tcg cgc cgc atc aaa cag aca gag tat agc gag atc gac 889 Ile Asn Glu Ser Arg Arg Ile Lys Gln Thr Glu Tyr Ser Glu Ile Asp 240 245 250 ggt cgc ctc agc tcc gag tac gat gcc aag ttg aag cag tcg ctg cag 937 Gly Arg Leu Ser Ser Glu Tyr Asp Ala Lys Leu Lys Gln Ser Leu Gln 255 260 265 gac gtg cgc gcc cag tac gag gag cag atg cag att aat cgc gat gaa 985 Asp Val Arg Ala Gln Tyr Glu Glu Gln Met Gln Ile Asn Arg Asp Glu 270 275 280 285 atc cag tcc ctc atc gag gac aag atc caa cga ctg caa gag gcc gcc 1033 Ile Gln Ser Leu Ile Glu Asp Lys Ile Gln Arg Leu Gln Glu Ala Ala 290 295 300 gca cgc aca tcc aat tcc acg cac aag tcc atc gag gag ctg cgc tcc 1081 Ala Arg Thr Ser Asn Ser Thr His Lys Ser Ile Glu Glu Leu Arg Ser 305 310 315 act cgt gtg cgt atc gat gcg ctc aac gcc aat atc aac gaa ctg gag 1129 Thr Arg Val Arg Ile Asp Ala Leu Asn Ala Asn Ile Asn Glu Leu Glu 320 325 
330 caa gcc aat gcc gac ctc aat gcg cgg atc cgt gat ctg gag cgc cag 1177 Gln Ala Asn Ala Asp Leu Asn Ala Arg Ile Arg Asp Leu Glu Arg Gln 335 340 345 ctg gac aac gat cgc gaa cgc cac ggt caa gag ata gac ctt ctc gag 1225 Leu Asp Asn Asp Arg Glu Arg His Gly Gln Glu Ile Asp Leu Leu Glu 350 355 360 365 aag gag ctc att cgg ctg cgc gaa gag atg acg caa cag ctc aag gag 1273 Lys Glu Leu Ile Arg Leu Arg Glu Glu Met Thr Gln Gln Leu Lys Glu 370 375 380 tac cag gac ctt atg gac atc aag gtc tcc ctg gat ttg gaa atc gcc 1321 Tyr Gln Asp Leu Met Asp Ile Lys Val Ser Leu Asp Leu Glu Ile Ala 385 390 395 gca tac gac aag ctg ctg gtg ggc gag gag gct cgt ttg aac atc acc 1369 Ala Tyr Asp Lys Leu Leu Val Gly Glu Glu Ala Arg Leu Asn Ile Thr 400 405 410 cca gcc acc aac acg gcc aca gtg cag tcc ttt agc cag tcg ctg cgc 1417 Pro Ala Thr Asn Thr Ala Thr Val Gln Ser Phe Ser Gln Ser Leu Arg 415 420 425 aac tcc acg cga gcc acg cca tcg cgt cgc act ccc tct gct gcc gtg 1465 Asn Ser Thr Arg Ala Thr Pro Ser Arg Arg Thr Pro Ser Ala Ala Val 430 435 440 445 aag cgc aaa cgc gcc gtg gtc gac gag tcg gag gat cac agc gtc gcc 1513 Lys Arg Lys Arg Ala Val Val Asp Glu Ser Glu Asp His Ser Val Ala 450 455 460 gat tac tat gtg tcc gcc agt gcc aag ggc aac gtg gag atc aag gag 1561 Asp Tyr Tyr Val Ser Ala Ser Ala Lys Gly Asn Val Glu Ile Lys Glu 465 470 475 atc gat ccc gag ggc aag ttc gta agg ctg ttc aac aag ggc agc gag 1609 Ile Asp Pro Glu Gly Lys Phe Val Arg Leu Phe Asn Lys Gly Ser Glu 480 485 490 gag gtg gcc atc ggt ggc tgg cag ctg cag cgg cta atc aac gag aaa 1657 Glu Val Ala Ile Gly Gly Trp Gln Leu Gln Arg Leu Ile Asn Glu Lys 495 500 505 ggt cct tcg acc act tac aag ttc cat cga tcg gtg agg atc gag cca 1705 Gly Pro Ser Thr Thr Tyr Lys Phe His Arg Ser Val Arg Ile Glu Pro 510 515 520 525 aat ggc gtg atc acc gtt tgg tcg gcg gac acc aag gcc tcg cac gag 1753 Asn Gly Val Ile Thr Val Trp Ser Ala Asp Thr Lys Ala Ser His Glu 530 535 540 ccg cca tcg agc ctt gtg atg aag tca cag aag tgg gtc tcc gcc gac 1801 Pro Pro Ser Ser Leu Val Met 
Lys Ser Gln Lys Trp Val Ser Ala Asp 545 550 555 aac act agg acg att ttg ctg aac tcc gag ggc gag gcc gtg gcc aat 1849 Asn Thr Arg Thr Ile Leu Leu Asn Ser Glu Gly Glu Ala Val Ala Asn 560 565 570 ctg gat cgc atc aag cgc att gtg tcc caa cac aca tcc tcc tcc cgg 1897 Leu Asp Arg Ile Lys Arg Ile Val Ser Gln His Thr Ser Ser Ser Arg 575 580 585 ctg agt cgt cgt cgc agc gtg acc gcc gtg gac ggc aat gag cag ctc 1945 Leu Ser Arg Arg Arg Ser Val Thr Ala Val Asp Gly Asn Glu Gln Leu 590 595 600 605 tac cac cag cag ggc gat cct cag cag tca aac gag aag tgc gcc att 1993 Tyr His Gln Gln Gly Asp Pro Gln Gln Ser Asn Glu Lys Cys Ala Ile 610 615 620 atg taa aatcaaacgc agcacaacac aactctttcc tctttgctga acaagacaaa 2049 Met * caaaataagc acacacgaag atcataattt agaacccaaa aacacacaca ccgaggtaca 2109 gagatttatt attcagctaa gttatttttt gtggccgcag cgccaattat ttaatcaaaa 2169 acatttgtgt agaagacatc ctgaatcttc gtttcgtttg tacactttcg ttttcccttt 2229 cttaacaaat tctatagttt attgtttcga tgttttgtat tccgcgttaa ctaatctatg 2289 taaactttat ttggtataaa ctggagagag catgttgcct cttttttatg ccacatagaa 2349 tttacgtaag acgttcactt cttgtattcg cggcggcaag aactttgaaa atattgcaag 2409 aaaaacaata ttttcgatgt acttacgctc cacctacata tttagtaaat tagtttttaa 2469 gctatatccc agataaacca gtcgacgtca gcaaacaaca acaaaaaaaa aacaacgagc 2529 acgtagtgag tgatttataa gatacaggtc aagaggatta actaaagaaa acaaacgctc 2589 ctgacaacag acatatttat ttaagtattt tttgtacaat caacataaaa tacattatac 2649 attatacata catatacata catacattat atatatagac tcatgcctac ggaagtgaca 2709 accagccagc aatatatatt tttagccatg gccatagggt ttacgatcca ccaaaacggc 2769 tttctcccgg tttagaccgg ttcttgagct tctgcagaca ttttaccgga cgagcaacaa 2829 taagaacagc aacaacaaca acagcattaa cagcaacaac acaaatgtac aacaattaca 2889 aaagcaacaa agaagaaaaa a 2910 90 622 PRT Drosophila melanogaster 90 Met Ser Ser Lys Ser Arg Arg Ala Gly Thr Ala Thr Pro Gln Pro Gly 1 5 10 15 Asn Thr Ser Thr Pro Arg Pro Pro Ser Ala Gly Pro Gln Pro Pro Pro 20 25 30 Pro Ser Thr His Ser Gln Thr Ala Ser Ser Pro Leu Ser Pro Thr Arg 35 40 45 His Ser Arg 
Val Ala Glu Lys Val Glu Leu Gln Asn Leu Asn Asp Arg 50 55 60 Leu Ala Thr Tyr Ile Asp Arg Val Arg Asn Leu Glu Thr Glu Asn Ser 65 70 75 80 Arg Leu Thr Ile Glu Val Gln Thr Thr Arg Asp Thr Val Thr Arg Glu 85 90 95 Thr Thr Asn Ile Lys Asn Ile Phe Glu Ala Glu Leu Leu Glu Thr Arg 100 105 110 Arg Leu Leu Asp Asp Thr Ala Arg Asp Arg Ala Arg Ala Glu Ile Asp 115 120 125 Ile Lys Arg Leu Trp Glu Arg Asn Glu Glu Leu Lys Asn Lys Leu Asp 130 135 140 Lys Lys Thr Lys Glu Cys Thr Thr Ala Glu Gly Asn Val Arg Met Tyr 145 150 155 160 Glu Ser Arg Ala Asn Glu Leu Asn Asn Lys Tyr Asn Gln Ala Asn Ala 165 170 175 Asp Arg Lys Lys Leu Asn Glu Asp Leu Asn Glu Ala Leu Lys Glu Leu 180 185 190 Glu Arg Leu Arg Lys Gln Phe Glu Glu Thr Arg Lys Asn Leu Glu Gln 195 200 205 Glu Thr Leu Ser Arg Val Asp Leu Glu Asn Thr Ile Gln Ser Leu Arg 210 215 220 Glu Glu Leu Ser Phe Lys Asp Gln Ile His Ser Gln Glu Ile Asn Glu 225 230 235 240 Ser Arg Arg Ile Lys Gln Thr Glu Tyr Ser Glu Ile Asp Gly Arg Leu 245 250 255 Ser Ser Glu Tyr Asp Ala Lys Leu Lys Gln Ser Leu Gln Asp Val Arg 260 265 270 Ala Gln Tyr Glu Glu Gln Met Gln Ile Asn Arg Asp Glu Ile Gln Ser 275 280 285 Leu Ile Glu Asp Lys Ile Gln Arg Leu Gln Glu Ala Ala Ala Arg Thr 290 295 300 Ser Asn Ser Thr His Lys Ser Ile Glu Glu Leu Arg Ser Thr Arg Val 305 310 315 320 Arg Ile Asp Ala Leu Asn Ala Asn Ile Asn Glu Leu Glu Gln Ala Asn 325 330 335 Ala Asp Leu Asn Ala Arg Ile Arg Asp Leu Glu Arg Gln Leu Asp Asn 340 345 350 Asp Arg Glu Arg His Gly Gln Glu Ile Asp Leu Leu Glu Lys Glu Leu 355 360 365 Ile Arg Leu Arg Glu Glu Met Thr Gln Gln Leu Lys Glu Tyr Gln Asp 370 375 380 Leu Met Asp Ile Lys Val Ser Leu Asp Leu Glu Ile Ala Ala Tyr Asp 385 390 395 400 Lys Leu Leu Val Gly Glu Glu Ala Arg Leu Asn Ile Thr Pro Ala Thr 405 410 415 Asn Thr Ala Thr Val Gln Ser Phe Ser Gln Ser Leu Arg Asn Ser Thr 420 425 430 Arg Ala Thr Pro Ser Arg Arg Thr Pro Ser Ala Ala Val Lys Arg Lys 435 440 445 Arg Ala Val Val Asp Glu Ser Glu Asp His Ser Val Ala Asp Tyr Tyr 450 455 460 Val Ser Ala Ser Ala Lys 
Gly Asn Val Glu Ile Lys Glu Ile Asp Pro 465 470 475 480 Glu Gly Lys Phe Val Arg Leu Phe Asn Lys Gly Ser Glu Glu Val Ala 485 490 495 Ile Gly Gly Trp Gln Leu Gln Arg Leu Ile Asn Glu Lys Gly Pro Ser 500 505 510 Thr Thr Tyr Lys Phe His Arg Ser Val Arg Ile Glu Pro Asn Gly Val 515 520 525 Ile Thr Val Trp Ser Ala Asp Thr Lys Ala Ser His Glu Pro Pro Ser 530 535 540 Ser Leu Val Met Lys Ser Gln Lys Trp Val Ser Ala Asp Asn Thr Arg 545 550 555 560 Thr Ile Leu Leu Asn Ser Glu Gly Glu Ala Val Ala Asn Leu Asp Arg 565 570 575 Ile Lys Arg Ile Val Ser Gln His Thr Ser Ser Ser Arg Leu Ser Arg 580 585 590 Arg Arg Ser Val Thr Ala Val Asp Gly Asn Glu Gln Leu Tyr His Gln 595 600 605 Gln Gly Asp Pro Gln Gln Ser Asn Glu Lys Cys Ala Ile Met 610 615 620 91 1456 DNA Homo sapiens CDS (72)...(1199) 91 atgcacttga gcagggaaga aatccacaag gactcaccag tctcctggtc tgcagagaag 60 acagaatcaa c atg agc aca gca gga aaa gta atc aaa tgc aaa gca gct 110 Met Ser Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala 1 5 10 gtg cta tgg gag tta aag aaa ccc ttt tcc att gag gag gtg gag gtt 158 Val Leu Trp Glu Leu Lys Lys Pro Phe Ser Ile Glu Glu Val Glu Val 15 20 25 gca cct cct aag gcc cat gaa gtt cgt att aag atg gtg gct gta gga 206 Ala Pro Pro Lys Ala His Glu Val Arg Ile Lys Met Val Ala Val Gly 30 35 40 45 atc tgt ggc aca gat gac cac gtg gtt agt ggt acc atg gtg acc cca 254 Ile Cys Gly Thr Asp Asp His Val Val Ser Gly Thr Met Val Thr Pro 50 55 60 ctt cct gtg att tta ggc cat gag gca gcc ggc atc gtg gag agt gtt 302 Leu Pro Val Ile Leu Gly His Glu Ala Ala Gly Ile Val Glu Ser Val 65 70 75 gga gaa ggg gtg act aca gtc aaa cca ggt gat aaa gtc atc cca ctc 350 Gly Glu Gly Val Thr Thr Val Lys Pro Gly Asp Lys Val Ile Pro Leu 80 85 90 gct att cct cag tgt gga aaa tgc aga att tgt aaa aac ccg gag agc 398 Ala Ile Pro Gln Cys Gly Lys Cys Arg Ile Cys Lys Asn Pro Glu Ser 95 100 105 aac tac tgc ttg aaa aac gat gta agc aat cct cag ggg acc ctg cag 446 Asn Tyr Cys Leu Lys Asn Asp Val Ser Asn Pro Gln Gly Thr Leu Gln 110 115 120 125 gat ggc acc agc agg 
ttc acc tgc agg agg aag ccc atc cac cac ttc 494 Asp Gly Thr Ser Arg Phe Thr Cys Arg Arg Lys Pro Ile His His Phe 130 135 140 ctt ggc atc agc acc ttc tca cag tac aca gtg gtg gat gaa aat gca 542 Leu Gly Ile Ser Thr Phe Ser Gln Tyr Thr Val Val Asp Glu Asn Ala 145 150 155 gta gcc aaa att gat gca gcc tcg cct cta gag aaa gtc tgt ctc att 590 Val Ala Lys Ile Asp Ala Ala Ser Pro Leu Glu Lys Val Cys Leu Ile 160 165 170 ggc tgt gga ttt tca act ggt tat ggg tct gca gtc aat gtt gcc aag 638 Gly Cys Gly Phe Ser Thr Gly Tyr Gly Ser Ala Val Asn Val Ala Lys 175 180 185 gtc acc cca ggc tct acc tgt gct gtg ttt ggc ctg gga ggg gtc ggc 686 Val Thr Pro Gly Ser Thr Cys Ala Val Phe Gly Leu Gly Gly Val Gly 190 195 200 205 cta tct gct att atg ggc tgt aaa gca gct ggg gca gcc aga atc att 734 Leu Ser Ala Ile Met Gly Cys Lys Ala Ala Gly Ala Ala Arg Ile Ile 210 215 220 gcg gtg gac atc aac aag gac aaa ttt gca aag gcc aaa gag ttg ggt 782 Ala Val Asp Ile Asn Lys Asp Lys Phe Ala Lys Ala Lys Glu Leu Gly 225 230 235 gcc act gaa tgc atc aac cct caa gac tac aag aaa ccc atc cag gag 830 Ala Thr Glu Cys Ile Asn Pro Gln Asp Tyr Lys Lys Pro Ile Gln Glu 240 245 250 gtg cta aag gaa atg act gat gga ggt gtg gat ttt tca ttt gaa gtc 878 Val Leu Lys Glu Met Thr Asp Gly Gly Val Asp Phe Ser Phe Glu Val 255 260 265 atc ggt cgg ctt gac acc atg atg gct tcc ctg tta tgt tgt cat gag 926 Ile Gly Arg Leu Asp Thr Met Met Ala Ser Leu Leu Cys Cys His Glu 270 275 280 285 gca tgt ggc aca agt gtc atc gta ggg gta cct cct gat tcc caa aac 974 Ala Cys Gly Thr Ser Val Ile Val Gly Val Pro Pro Asp Ser Gln Asn 290 295 300 ctc tca atg aac cct atg ctg cta ctg act gga cgt acc tgg aag gga 1022 Leu Ser Met Asn Pro Met Leu Leu Leu Thr Gly Arg Thr Trp Lys Gly 305 310 315 gct att ctt ggt ggc ttt aaa agt aaa gaa tgt gtc cca aaa ctt gtg 1070 Ala Ile Leu Gly Gly Phe Lys Ser Lys Glu Cys Val Pro Lys Leu Val 320 325 330 gct gat ttt atg gct aag aag ttt tca ttg gat gca tta ata acc cat 1118 Ala Asp Phe Met Ala Lys Lys Phe Ser Leu Asp Ala Leu Ile Thr His 335 
340 345 gtt tta cct ttt gaa aaa ata aat gaa gga ttt gac ctg ctt cac tct 1166 Val Leu Pro Phe Glu Lys Ile Asn Glu Gly Phe Asp Leu Leu His Ser 350 355 360 365 ggg aaa agt atc cgt acc att ctg atg ttt tga gacaatacag atgttttccc 1219 Gly Lys Ser Ile Arg Thr Ile Leu Met Phe * 370 375 ttgtggcagt cttcagcctc ctctacccta catgatctgg agcaacagct gggaaatatc 1279 attaattctg ctcatcacag attttatcaa taaattacat ttgggggctt tccaaagaaa 1339 tggaaattga tgtaaaatta tttttcaagc aaatgtttaa aatccaaatg agaactaaat 1399 aaagtgttga acatcagctg gggaattgaa gccaataaac cttccttctt aaccatt 1456 92 375 PRT Homo sapiens 92 Met Ser Thr Ala Gly Lys Val Ile Lys Cys Lys Ala Ala Val Leu Trp 1 5 10 15 Glu Leu Lys Lys Pro Phe Ser Ile Glu Glu Val Glu Val Ala Pro Pro 20 25 30 Lys Ala His Glu Val Arg Ile Lys Met Val Ala Val Gly Ile Cys Gly 35 40 45 Thr Asp Asp His Val Val Ser Gly Thr Met Val Thr Pro Leu Pro Val 50 55 60 Ile Leu Gly His Glu Ala Ala Gly Ile Val Glu Ser Val Gly Glu Gly 65 70 75 80 Val Thr Thr Val Lys Pro Gly Asp Lys Val Ile Pro Leu Ala Ile Pro 85 90 95 Gln Cys Gly Lys Cys Arg Ile Cys Lys Asn Pro Glu Ser Asn Tyr Cys 100 105 110 Leu Lys Asn Asp Val Ser Asn Pro Gln Gly Thr Leu Gln Asp Gly Thr 115 120 125 Ser Arg Phe Thr Cys Arg Arg Lys Pro Ile His His Phe Leu Gly Ile 130 135 140 Ser Thr Phe Ser Gln Tyr Thr Val Val Asp Glu Asn Ala Val Ala Lys 145 150 155 160 Ile Asp Ala Ala Ser Pro Leu Glu Lys Val Cys Leu Ile Gly Cys Gly 165 170 175 Phe Ser Thr Gly Tyr Gly Ser Ala Val Asn Val Ala Lys Val Thr Pro 180 185 190 Gly Ser Thr Cys Ala Val Phe Gly Leu Gly Gly Val Gly Leu Ser Ala 195 200 205 Ile Met Gly Cys Lys Ala Ala Gly Ala Ala Arg Ile Ile Ala Val Asp 210 215 220 Ile Asn Lys Asp Lys Phe Ala Lys Ala Lys Glu Leu Gly Ala Thr Glu 225 230 235 240 Cys Ile Asn Pro Gln Asp Tyr Lys Lys Pro Ile Gln Glu Val Leu Lys 245 250 255 Glu Met Thr Asp Gly Gly Val Asp Phe Ser Phe Glu Val Ile Gly Arg 260 265 270 Leu Asp Thr Met Met Ala Ser Leu Leu Cys Cys His Glu Ala Cys Gly 275 280 285 Thr Ser Val Ile Val Gly Val Pro Pro Asp Ser Gln Asn 
Leu Ser Met 290 295 300 Asn Pro Met Leu Leu Leu Thr Gly Arg Thr Trp Lys Gly Ala Ile Leu 305 310 315 320 Gly Gly Phe Lys Ser Lys Glu Cys Val Pro Lys Leu Val Ala Asp Phe 325 330 335 Met Ala Lys Lys Phe Ser Leu Asp Ala Leu Ile Thr His Val Leu Pro 340 345 350 Phe Glu Lys Ile Asn Glu Gly Phe Asp Leu Leu His Ser Gly Lys Ser 355 360 365 Ile Arg Thr Ile Leu Met Phe 370 375 93 1989 DNA Homo sapiens CDS (34)...(696) 93 gcggccgccg ggactcttcc tggagacacc gcc atg gcc ggg cta tcc cgc ggg 54 Met Ala Gly Leu Ser Arg Gly 1 5 tcc gcg cgc gca ctg ctc gcc gcc ctg ctg gcg tcg acg ctg ttg gcg 102 Ser Ala Arg Ala Leu Leu Ala Ala Leu Leu Ala Ser Thr Leu Leu Ala 10 15 20 ctg ctc gtg tcg ccc gcg cgg ggt cgc ggc ggc cgg gac cac ggg gac 150 Leu Leu Val Ser Pro Ala Arg Gly Arg Gly Gly Arg Asp His Gly Asp 25 30 35 tgg gac gag gcc tcc cgg ctg ccg ccg cta cca ccc cgc gag gac gcg 198 Trp Asp Glu Ala Ser Arg Leu Pro Pro Leu Pro Pro Arg Glu Asp Ala 40 45 50 55 gcg cgc gtg gcc cgc ttc gtg acg cac gtc tcc gac tgg ggc gct ctg 246 Ala Arg Val Ala Arg Phe Val Thr His Val Ser Asp Trp Gly Ala Leu 60 65 70 gcc acc atc tcc acg ctg gag gcg gtg cgc ggc cgg ccc ttc gcc gac 294 Ala Thr Ile Ser Thr Leu Glu Ala Val Arg Gly Arg Pro Phe Ala Asp 75 80 85 gtc ctc tcg ctc agc gac ggg ccc ccg ggc gcg ggc agc ggc gtg ccc 342 Val Leu Ser Leu Ser Asp Gly Pro Pro Gly Ala Gly Ser Gly Val Pro 90 95 100 tat ttc tac ctg agc ccg ctg cag ctc tcc gtg agc aac ctg cag gag 390 Tyr Phe Tyr Leu Ser Pro Leu Gln Leu Ser Val Ser Asn Leu Gln Glu 105 110 115 aat cca tat gct aca ctg acc atg act ttg gca cag acc aac ttc tgc 438 Asn Pro Tyr Ala Thr Leu Thr Met Thr Leu Ala Gln Thr Asn Phe Cys 120 125 130 135 aag aaa cat gga ttt gat cca caa agt ccc ctt tgt gtt cac ata atg 486 Lys Lys His Gly Phe Asp Pro Gln Ser Pro Leu Cys Val His Ile Met 140 145 150 ctg tca gga act gtg acc aag gtg aat gaa aca gaa atg gat att gca 534 Leu Ser Gly Thr Val Thr Lys Val Asn Glu Thr Glu Met Asp Ile Ala 155 160 165 aag cat tcg tta ttc att cga cac cct gag atg aaa acc tgg 
cct tcc 582 Lys His Ser Leu Phe Ile Arg His Pro Glu Met Lys Thr Trp Pro Ser 170 175 180 agc cat aat tgg ttc ttt gct aag ttg aat ata acc aat atc tgg gtc 630 Ser His Asn Trp Phe Phe Ala Lys Leu Asn Ile Thr Asn Ile Trp Val 185 190 195 ctg gac tac ttt ggt gga cca aaa atc gtg aca cca gaa gaa tat tat 678 Leu Asp Tyr Phe Gly Gly Pro Lys Ile Val Thr Pro Glu Glu Tyr Tyr 200 205 210 215 aat gtc aca gtt cag tga agcagactgt ggtgaattta gcaacactta 726 Asn Val Thr Val Gln * 220 tgaagtttct taaagtggct catacacact taaaaggctt aatgtttctc tggaaagcgt 786 cccagaatat tagccagttt tctgtcacat gctggtttgt ttgcttgctt gtttacttgc 846 ttgtttacca atagagttga cctgttattg gatttcctgg aagatgtggt agctactttt 906 ttcctatttt gaagccattt tcgtagagaa atatccttca ctataatcaa ataagttttg 966 tcccatcaat tccaaagatg tttccagtgg tgctcttgaa gaggaatgag taccagtttt 1026 aaattgccca ttggcatttg aaggtagttg agtatgtgtt ctttattcct agaagccact 1086 gtgcttggta gagtgcatca ctcaccacag ctgcctcctg agctgcctga gcctggtgca 1146 aaaggattgg cccccattat ggtgcttctg aataaatctt gccaagatag acaaacaatg 1206 atgaaactca gatggagctt cctactcatg ttgatttatg tctcacaatc ctgggtattg 1266 ttaattcaac atagggtgaa actatttctg ataaagaact tttgaaaaac tttttatact 1326 ctaaagtgat actcagaaca aaagaaagtc ataaaactcc tgaatttaat ttccccacct 1386 aagtcgagac agtattatca aaacacatgt gcacacagat tattttttgg ctccaaaact 1446 ggattgcaaa agaaagagga gaagaatatt ttgtgtgttc ctggtattct tttataagta 1506 aagtttaccc aggcatggac cagcttcagc cagggacaaa atcccctccc aaaccactct 1566 ccacagcttt ttaaaaatac ttctactctt aacaattacc taaggcttcc tcaactgccc 1626 caaatctctt aatagcttct agtgctgcta caatctaagt caggtcacca gagggaagag 1686 aacatggcat taaaagaatc acatcttcag aagagaagac actaatatta ttacccatat 1746 acatgatttc agaagatgac ataagattcc tcttaaagag gaaatgtcag gaatcaagcc 1806 actgaatcct taaagagaaa agttgaatat gagtcattgt gtctgaaaac tgcaaagtga 1866 acttaactga gatccagcaa acaggttctg tttaagaaaa ataatttata ctaaatttag 1926 taaaatggac ttcttattca aagcatcaat aattaaaaga attattttaa aaaaaaaaaa 1986 aaa 1989 94 220 PRT Homo sapiens 94 Met Ala 
Gly Leu Ser Arg Gly Ser Ala Arg Ala Leu Leu Ala Ala Leu 1 5 10 15 Leu Ala Ser Thr Leu Leu Ala Leu Leu Val Ser Pro Ala Arg Gly Arg 20 25 30 Gly Gly Arg Asp His Gly Asp Trp Asp Glu Ala Ser Arg Leu Pro Pro 35 40 45 Leu Pro Pro Arg Glu Asp Ala Ala Arg Val Ala Arg Phe Val Thr His 50 55 60 Val Ser Asp Trp Gly Ala Leu Ala Thr Ile Ser Thr Leu Glu Ala Val 65 70 75 80 Arg Gly Arg Pro Phe Ala Asp Val Leu Ser Leu Ser Asp Gly Pro Pro 85 90 95 Gly Ala Gly Ser Gly Val Pro Tyr Phe Tyr Leu Ser Pro Leu Gln Leu 100 105 110 Ser Val Ser Asn Leu Gln Glu Asn Pro Tyr Ala Thr Leu Thr Met Thr 115 120 125 Leu Ala Gln Thr Asn Phe Cys Lys Lys His Gly Phe Asp Pro Gln Ser 130 135 140 Pro Leu Cys Val His Ile Met Leu Ser Gly Thr Val Thr Lys Val Asn 145 150 155 160 Glu Thr Glu Met Asp Ile Ala Lys His Ser Leu Phe Ile Arg His Pro 165 170 175 Glu Met Lys Thr Trp Pro Ser Ser His Asn Trp Phe Phe Ala Lys Leu 180 185 190 Asn Ile Thr Asn Ile Trp Val Leu Asp Tyr Phe Gly Gly Pro Lys Ile 195 200 205 Val Thr Pro Glu Glu Tyr Tyr Asn Val Thr Val Gln 210 215 220 95 425 DNA Homo sapiens CDS (1)...(390) 95 gcc cgg cag tct gaa gac cac cct cat cgc cgg cgt cgg cgg ggc ttg 48 Ala Arg Gln Ser Glu Asp His Pro His Arg Arg Arg Arg Arg Gly Leu 1 5 10 15 gag tgt gat ggc aag gtc aac atc tgc tgt aag aaa cag ttc ttt gtc 96 Glu Cys Asp Gly Lys Val Asn Ile Cys Cys Lys Lys Gln Phe Phe Val 20 25 30 agt ttc aag gac atc ggc tgg aat gac tgg atc att gct ccc tct ggc 144 Ser Phe Lys Asp Ile Gly Trp Asn Asp Trp Ile Ile Ala Pro Ser Gly 35 40 45 tat cat gcc aac tac tgc gag ggt gag tgc ccg agc cat ata gca ggc 192 Tyr His Ala Asn Tyr Cys Glu Gly Glu Cys Pro Ser His Ile Ala Gly 50 55 60 acg tcc ggg tcc tca ctg tcc ttc cac tca aca gtc atc aac cac tac 240 Thr Ser Gly Ser Ser Leu Ser Phe His Ser Thr Val Ile Asn His Tyr 65 70 75 80 cgc atg cgg ggc cat agc ccc ttt gcc aac ctc aaa tcg tgc tgt gtg 288 Arg Met Arg Gly His Ser Pro Phe Ala Asn Leu Lys Ser Cys Cys Val 85 90 95 ccc acc aag ctg aga ccc atg tcc atg ttg tac tat gat gat ggt caa 336 Pro Thr 
Lys Leu Arg Pro Met Ser Met Leu Tyr Tyr Asp Asp Gly Gln 100 105 110 aac atc atc aaa aag gac att cag aac atg atc gtg gag gag tgt ggg 384 Asn Ile Ile Lys Lys Asp Ile Gln Asn Met Ile Val Glu Glu Cys Gly 115 120 125 tgc tca tagagttgcc cagcccaggg ggaaagggag caaga 425 Cys Ser 130 96 130 PRT Homo sapiens 96 Ala Arg Gln Ser Glu Asp His Pro His Arg Arg Arg Arg Arg Gly Leu 1 5 10 15 Glu Cys Asp Gly Lys Val Asn Ile Cys Cys Lys Lys Gln Phe Phe Val 20 25 30 Ser Phe Lys Asp Ile Gly Trp Asn Asp Trp Ile Ile Ala Pro Ser Gly 35 40 45 Tyr His Ala Asn Tyr Cys Glu Gly Glu Cys Pro Ser His Ile Ala Gly 50 55 60 Thr Ser Gly Ser Ser Leu Ser Phe His Ser Thr Val Ile Asn His Tyr 65 70 75 80 Arg Met Arg Gly His Ser Pro Phe Ala Asn Leu Lys Ser Cys Cys Val 85 90 95 Pro Thr Lys Leu Arg Pro Met Ser Met Leu Tyr Tyr Asp Asp Gly Gln 100 105 110 Asn Ile Ile Lys Lys Asp Ile Gln Asn Met Ile Val Glu Glu Cys Gly 115 120 125 Cys Ser 130 97 864 PRT Homo sapiens 97 Met Gly Asn Arg Gly Met Glu Asp Leu Ile Pro Leu Val Asn Arg Leu 1 5 10 15 Gln Asp Ala Phe Ser Ala Ile Gly Gln Asn Ala Asp Leu Asp Leu Pro 20 25 30 Gln Ile Ala Val Val Gly Gly Gln Ser Ala Gly Lys Ser Ser Val Leu 35 40 45 Glu Asn Phe Val Gly Arg Asp Phe Leu Pro Arg Gly Ser Gly Ile Val 50 55 60 Thr Arg Arg Pro Leu Val Leu Gln Leu Val Asn Ala Thr Thr Glu Tyr 65 70 75 80 Ala Glu Phe Leu His Cys Lys Gly Lys Lys Phe Thr Asp Phe Glu Glu 85 90 95 Val Arg Leu Glu Ile Glu Ala Glu Thr Asp Arg Val Thr Gly Thr Asn 100 105 110 Lys Gly Ile Ser Pro Val Pro Ile Asn Leu Arg Val Tyr Ser Pro His 115 120 125 Val Leu Asn Leu Thr Leu Val Asp Leu Pro Gly Met Thr Lys Val Pro 130 135 140 Val Gly Asp Gln Pro Pro Asp Ile Glu Phe Gln Ile Arg Asp Met Leu 145 150 155 160 Met Gln Phe Val Thr Lys Glu Asn Cys Leu Ile Leu Ala Val Ser Pro 165 170 175 Ala Asn Ser Asp Leu Ala Asn Ser Asp Ala Leu Lys Val Ala Lys Glu 180 185 190 Val Asp Pro Gln Gly Gln Arg Thr Ile Gly Val Ile Thr Lys Leu Asp 195 200 205 Leu Met Asp Glu Gly Thr Asp Ala Arg Asp Val Leu Glu Asn Lys Leu 210 215 220 Leu Pro 
Leu Arg Arg Gly Tyr Ile Gly Val Val Asn Arg Ser Gln Lys 225 230 235 240 Asp Ile Asp Gly Lys Lys Asp Ile Thr Ala Ala Leu Ala Ala Glu Arg 245 250 255 Lys Phe Phe Leu Ser His Pro Ser Tyr Arg His Leu Ala Asp Arg Met 260 265 270 Gly Thr Pro Tyr Leu Gln Lys Val Leu Asn Gln Gln Leu Thr Asn His 275 280 285 Ile Arg Asp Thr Leu Pro Gly Leu Arg Asn Lys Leu Gln Ser Gln Leu 290 295 300 Leu Ser Ile Glu Lys Glu Val Glu Glu Tyr Lys Asn Phe Arg Pro Asp 305 310 315 320 Asp Pro Ala Arg Lys Thr Lys Ala Leu Leu Gln Met Val Gln Gln Phe 325 330 335 Ala Val Asp Phe Glu Lys Arg Ile Glu Gly Ser Gly Asp Gln Ile Asp 340 345 350 Thr Tyr Glu Leu Ser Gly Gly Ala Arg Ile Asn Arg Ile Phe His Glu 355 360 365 Arg Phe Pro Phe Glu Leu Val Lys Met Glu Phe Asp Glu Lys Glu Leu 370 375 380 Arg Arg Glu Ile Ser Tyr Ala Ile Lys Asn Ile His Gly Ile Arg Thr 385 390 395 400 Gly Leu Phe Thr Pro Asp Leu Ala Phe Glu Ala Thr Val Lys Lys Gln 405 410 415 Val Gln Lys Leu Lys Glu Pro Ser Ile Lys Cys Val Asp Met Val Val 420 425 430 Ser Glu Leu Thr Ala Thr Ile Arg Lys Cys Ser Glu Lys Leu Gln Gln 435 440 445 Tyr Pro Arg Leu Arg Glu Glu Met Glu Arg Ile Val Thr Thr His Ile 450 455 460 Arg Glu Arg Glu Gly Arg Thr Lys Glu Gln Val Met Leu Leu Ile Asp 465 470 475 480 Ile Glu Leu Ala Tyr Met Asn Thr Asn His Glu Asp Phe Ile Gly Phe 485 490 495 Ala Asn Ala Gln Gln Arg Ser Asn Gln Met Asn Lys Lys Lys Thr Ser 500 505 510 Gly Asn Gln Asp Glu Ile Leu Val Ile Arg Lys Gly Trp Leu Thr Ile 515 520 525 Asn Asn Ile Gly Ile Met Lys Gly Gly Ser Lys Glu Tyr Trp Phe Val 530 535 540 Leu Thr Ala Glu Asn Leu Ser Trp Tyr Lys Asp Asp Glu Glu Lys Glu 545 550 555 560 Lys Lys Tyr Met Leu Ser Val Asp Asn Leu Lys Leu Arg Asp Val Glu 565 570 575 Lys Gly Phe Met Ser Ser Lys His Ile Phe Ala Leu Phe Asn Thr Glu 580 585 590 Gln Arg Asn Val Tyr Lys Asp Tyr Arg Gln Leu Glu Leu Ala Cys Glu 595 600 605 Thr Gln Glu Glu Val Asp Ser Trp Lys Ala Ser Phe Leu Arg Ala Gly 610 615 620 Val Tyr Pro Glu Arg Val Gly Asp Lys Glu Lys Ala Ser Glu Thr Glu 625 630 635 640 Glu Asn 
Gly Ser Asp Ser Phe Met His Ser Met Asp Pro Gln Leu Glu 645 650 655 Arg Gln Val Glu Thr Ile Arg Asn Leu Val Asp Ser Tyr Met Ala Ile 660 665 670 Val Asn Lys Thr Val Arg Asp Leu Met Pro Lys Thr Ile Met His Leu 675 680 685 Met Ile Asn Asn Thr Lys Glu Phe Ile Phe Ser Glu Leu Leu Ala Asn 690 695 700 Leu Tyr Ser Cys Gly Asp Gln Asn Thr Leu Met Glu Glu Ser Ala Glu 705 710 715 720 Gln Ala Gln Arg Arg Asp Glu Met Leu Arg Met Tyr His Ala Leu Lys 725 730 735 Glu Ala Leu Ser Ile Ile Gly Asn Ile Asn Thr Thr Thr Val Ser Thr 740 745 750 Pro Met Pro Pro Pro Val Asp Asp Ser Trp Leu Gln Val Gln Ser Val 755 760 765 Pro Ala Gly Arg Arg Ser Pro Thr Ser Ser Pro Thr Pro Gln Arg Arg 770 775 780 Ala Pro Ala Val Pro Pro Ala Arg Pro Gly Ser Arg Gly Pro Ala Pro 785 790 795 800 Gly Pro Pro Pro Ala Gly Ser Ala Leu Gly Gly Ala Pro Pro Val Pro 805 810 815 Ser Arg Pro Gly Ala Ser Pro Asp Pro Phe Gly Pro Pro Pro Gln Val 820 825 830 Pro Ser Arg Pro Asn Arg Ala Pro Pro Gly Val Pro Ser Arg Ser Gly 835 840 845 Gln Ala Ser Pro Ser Arg Pro Glu Ser Pro Arg Pro Pro Phe Asp Leu 850 855 860 98 98 000 99 1940 PRT Homo sapiens 99 Met Ser Ser Asp Thr Glu Met Glu Val Phe Gly Ile Ala Ala Pro Phe 1 5 10 15 Leu Arg Lys Ser Glu Lys Glu Arg Ile Glu Ala Gln Asn Gln Pro Phe 20 25 30 Asp Ala Lys Thr Tyr Cys Phe Val Val Asp Ser Lys Glu Glu Tyr Ala 35 40 45 Lys Gly Lys Ile Lys Ser Ser Gln Asp Gly Lys Val Thr Val Glu Thr 50 55 60 Glu Asp Asn Arg Thr Leu Val Val Lys Pro Glu Asp Val Tyr Ala Met 65 70 75 80 Asn Pro Pro Lys Phe Asp Arg Ile Glu Asp Met Ala Met Leu Thr His 85 90 95 Leu Asn Glu Pro Ala Val Leu Tyr Asn Leu Lys Asp Arg Tyr Thr Ser 100 105 110 Trp Met Ile Tyr Thr Tyr Ser Gly Leu Phe Cys Val Thr Val Asn Pro 115 120 125 Tyr Lys Trp Leu Pro Val Tyr Asn Pro Glu Val Val Glu Gly Tyr Arg 130 135 140 Gly Lys Lys Arg Gln Glu Ala Pro Pro His Ile Phe Ser Ile Ser Asp 145 150 155 160 Asn Ala Tyr Gln Phe Met Leu Thr Asp Arg Glu Asn Gln Ser Ile Leu 165 170 175 Ile Thr Gly Glu Ser Gly Ala Gly Lys Thr Val Asn Thr Lys Arg Val 180 185 
190 Ile Gln Tyr Phe Ala Thr Ile Ala Ala Thr Gly Asp Leu Ala Lys Lys 195 200 205 Lys Asp Ser Lys Met Lys Gly Thr Leu Glu Asp Gln Ile Ile Ser Ala 210 215 220 Asn Pro Leu Leu Glu Ala Phe Gly Asn Ala Lys Thr Val Arg Asn Asp 225 230 235 240 Asn Ser Ser Arg Phe Gly Lys Phe Ile Arg Ile His Phe Gly Thr Thr 245 250 255 Gly Lys Leu Ala Ser Ala Asp Ile Glu Thr Tyr Leu Leu Glu Lys Ser 260 265 270 Arg Val Thr Phe Gln Leu Lys Ala Glu Arg Ser Tyr His Ile Phe Tyr 275 280 285 Gln Ile Leu Ser Asn Lys Lys Pro Glu Leu Ile Glu Leu Leu Leu Ile 290 295 300 Thr Thr Asn Pro Tyr Asp Tyr Pro Phe Ile Ser Gln Gly Glu Ile Leu 305 310 315 320 Val Ala Ser Ile Asp Asp Arg Glu Glu Leu Leu Ala Thr Asp Ser Ala 325 330 335 Ile Asp Ile Leu Gly Phe Thr Pro Glu Glu Lys Ser Gly Leu Tyr Lys 340 345 350 Leu Thr Gly Ala Val Met His Tyr Gly Asn Met Lys Phe Lys Gln Lys 355 360 365 Gln Arg Glu Glu Gln Ala Glu Pro Asp Gly Thr Glu Val Ala Asp Lys 370 375 380 Thr Ala Tyr Leu Met Gly Leu Asn Ser Ser Asp Leu Leu Lys Ala Leu 385 390 395 400 Cys Phe Pro Arg Val Lys Val Gly Asn Glu Tyr Val Thr Lys Gly Gln 405 410 415 Thr Val Asp Gln Val His His Ala Val Asn Ala Leu Ser Lys Ser Val 420 425 430 Tyr Glu Lys Leu Phe Leu Trp Met Val Thr Arg Ile Asn Gln Gln Leu 435 440 445 Asp Thr Lys Leu Pro Arg Gln His Phe Ile Gly Val Leu Asp Ile Ala 450 455 460 Gly Phe Glu Ile Phe Glu Tyr Asn Ser Leu Glu Gln Leu Cys Ile Asn 465 470 475 480 Phe Thr Asn Glu Lys Leu Gln Gln Phe Phe Asn His His Met Phe Val 485 490 495 Leu Glu Gln Glu Glu Tyr Lys Lys Glu Gly Ile Glu Trp Thr Phe Ile 500 505 510 Asp Phe Gly Met Asp Leu Ala Ala Cys Ile Glu Leu Ile Glu Lys Pro 515 520 525 Met Gly Ile Phe Ser Ile Leu Glu Glu Glu Cys Met Phe Pro Lys Ala 530 535 540 Thr Asp Thr Ser Phe Lys Asn Lys Leu Tyr Asp Gln His Leu Gly Lys 545 550 555 560 Ser Asn Asn Phe Gln Lys Pro Lys Val Val Lys Gly Arg Ala Glu Ala 565 570 575 His Phe Ser Leu Ile His Tyr Ala Gly Thr Val Asp Tyr Ser Val Ser 580 585 590 Gly Trp Leu Glu Lys Asn Lys Asp Pro Leu Asn Glu Thr Val Val Gly 595 600 605 
Leu Tyr Gln Lys Ser Ser Asn Arg Leu Leu Ala His Leu Tyr Ala Thr 610 615 620 Phe Ala Thr Ala Asp Ala Asp Ser Gly Lys Lys Lys Val Ala Lys Lys 625 630 635 640 Lys Gly Ser Ser Phe Gln Thr Val Ser Ala Leu Phe Arg Glu Asn Leu 645 650 655 Asn Lys Leu Met Ser Asn Leu Arg Thr Thr His Pro His Phe Val Arg 660 665 670 Cys Ile Ile Pro Asn Glu Thr Lys Thr Pro Gly Ala Met Glu His Ser 675 680 685 Leu Val Leu His Gln Leu Arg Cys Asn Gly Val Leu Glu Gly Ile Arg 690 695 700 Ile Cys Arg Lys Gly Phe Pro Asn Arg Ile Leu Tyr Gly Asp Phe Lys 705 710 715 720 Gln Arg Tyr Arg Val Leu Asn Ala Ser Ala Ile Leu Glu Gly Gln Phe 725 730 735 Ile Asp Ser Lys Lys Ala Cys Glu Lys Leu Leu Ala Ser Ile Asp Ile 740 745 750 Asp His Thr Gln Tyr Lys Phe Gly His Thr Lys Val Phe Phe Lys Ala 755 760 765 Gly Leu Leu Gly Thr Leu Glu Glu Met Arg Asp Asp Arg Leu Ala Lys 770 775 780 Leu Ile Thr Arg Thr Gln Ala Val Cys Arg Gly Phe Leu Met Arg Val 785 790 795 800 Glu Phe Gln Lys Met Val Gln Arg Arg Glu Ser Ile Phe Cys Ile Gln 805 810 815 Tyr Asn Ile Arg Ser Phe Met Asn Val Lys His Trp Pro Trp Met Lys 820 825 830 Leu Phe Phe Lys Ile Lys Pro Leu Leu Lys Ser Ala Glu Thr Glu Lys 835 840 845 Glu Met Ala Thr Met Lys Glu Glu Phe Gln Lys Thr Lys Asp Glu Leu 850 855 860 Ala Lys Ser Glu Ala Lys Arg Lys Glu Leu Glu Glu Lys Leu Val Thr 865 870 875 880 Leu Val Gln Glu Lys Asn Asp Leu Gln Leu Gln Val Gln Ala Glu Ser 885 890 895 Glu Asn Leu Leu Asp Ala Glu Glu Arg Cys Asp Gln Leu Ile Lys Ala 900 905 910 Lys Phe Gln Leu Glu Ala Lys Ile Lys Glu Val Thr Glu Arg Ala Glu 915 920 925 Asp Glu Glu Glu Ile Asn Ala Glu Leu Thr Ala Lys Lys Arg Lys Leu 930 935 940 Glu Asp Glu Cys Ser Glu Leu Lys Lys Asp Ile Asp Asp Leu Glu Leu 945 950 955 960 Thr Leu Ala Lys Val Glu Lys Glu Lys His Ala Thr Glu Asn Lys Val 965 970 975 Lys Asn Leu Thr Glu Glu Leu Ser Gly Leu Asp Glu Thr Ile Ala Lys 980 985 990 Leu Thr Arg Glu Lys Lys Ala Leu Gln Glu Ala His Gln Gln Ala Leu 995 1000 1005 Asp Asp Leu Gln Ala Glu Glu Asp Lys Val Asn Ser Leu Asn Lys Thr 1010 1015 
1020 Lys Ser Lys Leu Glu Gln Gln Val Glu Asp Leu Glu Ser Ser Leu Glu 1025 1030 1035 1040 Gln Glu Lys Lys Leu Arg Val Asp Leu Glu Arg Asn Lys Arg Lys Leu 1045 1050 1055 Glu Gly Asp Leu Lys Leu Ala Gln Glu Ser Ile Leu Asp Leu Glu Asn 1060 1065 1070 Asp Lys Gln Gln Leu Asp Glu Arg Leu Lys Lys Lys Asp Phe Glu Tyr 1075 1080 1085 Cys Gln Leu Gln Ser Lys Val Glu Asp Glu Gln Thr Leu Gly Leu Gln 1090 1095 1100 Phe Gln Lys Lys Ile Lys Glu Leu Gln Ala Arg Ile Glu Glu Leu Glu 1105 1110 1115 1120 Glu Glu Ile Glu Ala Glu Arg Ala Thr Arg Ala Lys Thr Glu Lys Gln 1125 1130 1135 Arg Ser Asp Tyr Ala Arg Glu Leu Glu Glu Leu Ser Glu Arg Leu Glu 1140 1145 1150 Glu Ala Gly Gly Val Thr Ser Thr Gln Ile Glu Leu Asn Lys Lys Arg 1155 1160 1165 Glu Ala Glu Phe Leu Lys Leu Arg Arg Asp Leu Glu Glu Ala Thr Leu 1170 1175 1180 Gln His Glu Ala Met Val Ala Thr Leu Arg Lys Lys His Ala Asp Ser 1185 1190 1195 1200 Val Ala Glu Leu Gly Glu Gln Ile Asp Asn Leu Gln Arg Val Lys Gln 1205 1210 1215 Lys Leu Glu Lys Glu Lys Ser Glu Phe Lys Leu Glu Ile Asp Asp Leu 1220 1225 1230 Ser Ser Ser Met Glu Ser Val Ser Lys Ser Lys Ala Asn Leu Glu Lys 1235 1240 1245 Ile Cys Arg Thr Leu Glu Asp Gln Leu Ser Glu Ala Arg Gly Lys Asn 1250 1255 1260 Glu Glu Ile Gln Arg Ser Leu Ser Glu Leu Thr Thr Gln Lys Ser Arg 1265 1270 1275 1280 Leu Gln Thr Glu Ala Gly Glu Leu Ser Arg Gln Leu Glu Glu Lys Glu 1285 1290 1295 Ser Ile Val Ser Gln Leu Ser Arg Ser Lys Gln Ala Phe Thr Gln Gln 1300 1305 1310 Thr Glu Glu Leu Lys Arg Gln Leu Glu Glu Glu Asn Lys Ala Lys Asn 1315 1320 1325 Ala Leu Ala His Ala Leu Gln Ser Ser Arg His Asp Cys Asp Leu Leu 1330 1335 1340 Arg Glu Gln Tyr Glu Glu Glu Gln Glu Gly Lys Ala Glu Leu Gln Arg 1345 1350 1355 1360 Ala Leu Ser Lys Ala Asn Ser Glu Val Ala Gln Trp Arg Thr Lys Tyr 1365 1370 1375 Glu Thr Asp Ala Ile Gln Arg Thr Glu Glu Leu Glu Glu Ala Lys Lys 1380 1385 1390 Lys Leu Ala Gln Arg Leu Gln Asp Ser Glu Glu Gln Val Glu Ala Val 1395 1400 1405 Asn Ala Lys Cys Ala Ser Leu Glu Lys Thr Lys Gln Arg Leu Gln Gly 1410 1415 
1420 Glu Val Glu Asp Leu Met Val Asp Val Glu Arg Ala Asn Ser Leu Ala 1425 1430 1435 1440 Ala Ala Leu Asp Lys Lys Gln Arg Asn Phe Asp Lys Val Leu Ala Glu 1445 1450 1455 Trp Lys Thr Lys Cys Glu Glu Ser Gln Ala Glu Leu Glu Ala Ser Leu 1460 1465 1470 Lys Glu Ser Arg Ser Leu Ser Thr Glu Leu Phe Lys Leu Lys Asn Ala 1475 1480 1485 Tyr Glu Glu Ala Leu Asp Gln Leu Glu Thr Val Lys Arg Glu Asn Lys 1490 1495 1500 Asn Leu Glu Gln Glu Ile Ala Asp Leu Thr Glu Gln Ile Ala Glu Asn 1505 1510 1515 1520 Gly Lys Thr Ile His Glu Leu Glu Lys Ser Arg Lys Gln Ile Glu Leu 1525 1530 1535 Glu Lys Ala Asp Ile Gln Leu Ala Leu Glu Glu Ala Glu Ala Ala Leu 1540 1545 1550 Glu His Glu Glu Ala Lys Ile Leu Arg Ile Gln Leu Glu Leu Thr Gln 1555 1560 1565 Val Lys Ser Glu Ile Asp Arg Lys Ile Ala Glu Lys Asp Glu Glu Ile 1570 1575 1580 Glu Gln Leu Lys Arg Asn Tyr Gln Arg Thr Val Glu Thr Met Gln Ser 1585 1590 1595 1600 Ala Leu Asp Ala Glu Val Arg Ser Arg Asn Glu Ala Ile Arg Leu Lys 1605 1610 1615 Lys Lys Met Glu Gly Asp Leu Asn Glu Ile Glu Ile Gln Leu Ser His 1620 1625 1630 Ala Asn Arg Gln Ala Ala Glu Thr Leu Lys His Leu Arg Ser Val Gln 1635 1640 1645 Gly Gln Leu Lys Asp Thr Gln Leu His Leu Asp Asp Ala Leu Arg Gly 1650 1655 1660 Gln Glu Asp Leu Lys Glu Gln Leu Ala Ile Val Glu Arg Arg Ala Asn 1665 1670 1675 1680 Leu Leu Gln Ala Glu Val Glu Glu Leu Arg Ala Thr Leu Glu Gln Thr 1685 1690 1695 Glu Arg Ala Arg Lys Leu Ala Glu Gln Glu Leu Leu Asp Ser Asn Glu 1700 1705 1710 Arg Val Gln Leu Leu His Thr Gln Asn Thr Ser Leu Ile His Thr Lys 1715 1720 1725 Lys Lys Leu Glu Thr Asp Leu Met Gln Leu Gln Ser Glu Val Glu Asp 1730 1735 1740 Ala Ser Arg Asp Ala Arg Asn Ala Glu Glu Lys Ala Lys Lys Ala Ile 1745 1750 1755 1760 Thr Asp Ala Ala Met Met Ala Glu Glu Leu Lys Lys Glu Gln Asp Thr 1765 1770 1775 Ser Ala His Leu Glu Arg Met Lys Lys Asn Leu Glu Gln Thr Val Lys 1780 1785 1790 Asp Leu Gln His Arg Leu Asp Glu Ala Glu Gln Leu Ala Leu Lys Gly 1795 1800 1805 Gly Lys Lys Gln Ile Gln Lys Leu Glu Thr Arg Ile Arg Glu Leu Glu 1810 1815 
1820 Phe Glu Leu Glu Gly Glu Gln Lys Lys Asn Thr Glu Ser Val Lys Gly 1825 1830 1835 1840 Leu Arg Lys Tyr Glu Arg Arg Val Lys Glu Leu Thr Tyr Gln Ser Glu 1845 1850 1855 Glu Asp Arg Lys Asn Val Leu Arg Leu Gln Asp Leu Val Asp Lys Leu 1860 1865 1870 Gln Val Lys Val Lys Ser Tyr Lys Arg Gln Ala Glu Glu Ala Asp Glu 1875 1880 1885 Gln Ala Asn Ala His Leu Thr Lys Phe Arg Lys Ala Gln His Glu Leu 1890 1895 1900 Glu Glu Ala Glu Glu Arg Ala Asp Ile Ala Glu Ser Gln Val Asn Lys 1905 1910 1915 1920 Leu Arg Ala Lys Thr Arg Asp Phe Thr Ser Ser Arg Met Val Val His 1925 1930 1935 Glu Ser Glu Glu 1940 100 1883 DNA Homo sapiens CDS (59)...(1720) 100 gtcccagtca gtccggaggc tgcggctgca gaagtaccgc tgcggagtaa ctgcaaag 58 atg ctg tcc gtg cgc gtt gct gcg gcc gtg gtc cgc gcc ctt cct cgg 106 Met Leu Ser Val Arg Val Ala Ala Ala Val Val Arg Ala Leu Pro Arg 1 5 10 15 cgg gcc gga ctg gtc tcc aga aat gct ttg ggt tca tct ttc att gct 154 Arg Ala Gly Leu Val Ser Arg Asn Ala Leu Gly Ser Ser Phe Ile Ala 20 25 30 gca agg aac ttc cat gcc tct aac act cat ctt caa aag act ggg act 202 Ala Arg Asn Phe His Ala Ser Asn Thr His Leu Gln Lys Thr Gly Thr 35 40 45 gct gag atg tcc tct att ctt gaa gag cgt att ctt gga gct gat acc 250 Ala Glu Met Ser Ser Ile Leu Glu Glu Arg Ile Leu Gly Ala Asp Thr 50 55 60 tct gtt gat ctt gaa gaa act ggg cgt gtc tta agt att ggt gat ggt 298 Ser Val Asp Leu Glu Glu Thr Gly Arg Val Leu Ser Ile Gly Asp Gly 65 70 75 80 att gcc cgc gta cat ggg ctg agg aat gtt caa gca gaa gaa atg gta 346 Ile Ala Arg Val His Gly Leu Arg Asn Val Gln Ala Glu Glu Met Val 85 90 95 gag ttt tct tca ggc tta aag ggt atg tcc ttg aac ttg gaa cct gac 394 Glu Phe Ser Ser Gly Leu Lys Gly Met Ser Leu Asn Leu Glu Pro Asp 100 105 110 aat gtt ggt gtt gtc gtg ttt gga aat gat aaa cta att aag gaa gga 442 Asn Val Gly Val Val Val Phe Gly Asn Asp Lys Leu Ile Lys Glu Gly 115 120 125 gat ata gtg aag agg aca gga gcc att gtg gac gtt cca gtt ggt gag 490 Asp Ile Val Lys Arg Thr Gly Ala Ile Val Asp Val Pro Val Gly Glu 130 135 140 gag ctg ttg 
ggt cgt gta gtt gat gcc ctt ggt aat gct att gat gga 538 Glu Leu Leu Gly Arg Val Val Asp Ala Leu Gly Asn Ala Ile Asp Gly 145 150 155 160 aag ggt cca att ggt tcc aag acg cgt agg cga gtt ggt ctg aaa gcc 586 Lys Gly Pro Ile Gly Ser Lys Thr Arg Arg Arg Val Gly Leu Lys Ala 165 170 175 ccc ggt atc att cct cga att tca gtg cgg gaa cca atg cag act ggc 634 Pro Gly Ile Ile Pro Arg Ile Ser Val Arg Glu Pro Met Gln Thr Gly 180 185 190 att aag gct gtg gat agc ttg gtg cca att ggt cgt ggt cag cgt gaa 682 Ile Lys Ala Val Asp Ser Leu Val Pro Ile Gly Arg Gly Gln Arg Glu 195 200 205 ctg att att ggt gac cga cag act ggg aaa acc tca att gct att gac 730 Leu Ile Ile Gly Asp Arg Gln Thr Gly Lys Thr Ser Ile Ala Ile Asp 210 215 220 aca atc att aac cag aaa cgt ttc aat gat gga tct gat gaa aag aag 778 Thr Ile Ile Asn Gln Lys Arg Phe Asn Asp Gly Ser Asp Glu Lys Lys 225 230 235 240 aag ctg tac tgt att tat gtt gct att ggt caa aag aga tcc act gtt 826 Lys Leu Tyr Cys Ile Tyr Val Ala Ile Gly Gln Lys Arg Ser Thr Val 245 250 255 gcc cag ttg gtg aag aga ctt aca gat gca gat gcc atg aag tac acc 874 Ala Gln Leu Val Lys Arg Leu Thr Asp Ala Asp Ala Met Lys Tyr Thr 260 265 270 att gtg gtg tcg gct acg gcc tcg gat gct gcc cca ctt cag tac ctg 922 Ile Val Val Ser Ala Thr Ala Ser Asp Ala Ala Pro Leu Gln Tyr Leu 275 280 285 gct cct tac tct ggc tgt tcc atg gga gag tat ttt aga gac aat ggc 970 Ala Pro Tyr Ser Gly Cys Ser Met Gly Glu Tyr Phe Arg Asp Asn Gly 290 295 300 aaa cat gct ttg atc atc tat gac gac tta tcc aaa cag gct gtt gct 1018 Lys His Ala Leu Ile Ile Tyr Asp Asp Leu Ser Lys Gln Ala Val Ala 305 310 315 320 tac cgt cag atg tct ctg ttg ctc cgc cga ccc cct ggt cgt gag gcc 1066 Tyr Arg Gln Met Ser Leu Leu Leu Arg Arg Pro Pro Gly Arg Glu Ala 325 330 335 tat cct ggt gat gtg ttc tac cta cac tcc cgg ttg ctg gag aga gca 1114 Tyr Pro Gly Asp Val Phe Tyr Leu His Ser Arg Leu Leu Glu Arg Ala 340 345 350 gcc aaa atg aac gat gct ttt ggt ggt ggc tcc ttg act gct ttg cca 1162 Ala Lys Met Asn Asp Ala Phe Gly Gly Gly Ser Leu Thr Ala 
Leu Pro 355 360 365 gtc ata gaa aca cag gct ggt gat gtg tct gct tac att cca aca aat 1210 Val Ile Glu Thr Gln Ala Gly Asp Val Ser Ala Tyr Ile Pro Thr Asn 370 375 380 gtc att tcc atc act gac gga cag atc ttc ttg gaa aca gaa ttg ttc 1258 Val Ile Ser Ile Thr Asp Gly Gln Ile Phe Leu Glu Thr Glu Leu Phe 385 390 395 400 tac aaa ggt atc cgc cct gca att aac gtt ggt ctg tct gta tct cgt 1306 Tyr Lys Gly Ile Arg Pro Ala Ile Asn Val Gly Leu Ser Val Ser Arg 405 410 415 gtc gga tcc gct gcc caa acc agg gct atg aag cag gta gca ggt acc 1354 Val Gly Ser Ala Ala Gln Thr Arg Ala Met Lys Gln Val Ala Gly Thr 420 425 430 atg aag ctg gaa ttg gct cag tat cgt gag gtt gct gct ttt gcc cag 1402 Met Lys Leu Glu Leu Ala Gln Tyr Arg Glu Val Ala Ala Phe Ala Gln 435 440 445 ttc ggt tct gac ctc gat gct gcc act caa caa ctt ttg agt cgt ggc 1450 Phe Gly Ser Asp Leu Asp Ala Ala Thr Gln Gln Leu Leu Ser Arg Gly 450 455 460 gtg cgt cta act gag ttg ctg aag caa gga cag tat tct ccc atg gct 1498 Val Arg Leu Thr Glu Leu Leu Lys Gln Gly Gln Tyr Ser Pro Met Ala 465 470 475 480 att gaa gaa caa gtg gct gtt atc tat gcg ggt gta agg gga tat ctt 1546 Ile Glu Glu Gln Val Ala Val Ile Tyr Ala Gly Val Arg Gly Tyr Leu 485 490 495 gat aaa ctg gag ccc agc aag att aca aag ttt gag aat gct ttc ttg 1594 Asp Lys Leu Glu Pro Ser Lys Ile Thr Lys Phe Glu Asn Ala Phe Leu 500 505 510 tct cat gtc gtc agc cag cac caa gcc ttg ttg ggc act atc agg gct 1642 Ser His Val Val Ser Gln His Gln Ala Leu Leu Gly Thr Ile Arg Ala 515 520 525 gat gga aag atc tca gaa caa tca gat gca aag ctg aaa gag att gta 1690 Asp Gly Lys Ile Ser Glu Gln Ser Asp Ala Lys Leu Lys Glu Ile Val 530 535 540 aca aat ttc ttg gct gga ttt gaa gct taa actcctgtgg attcacatca 1740 Thr Asn Phe Leu Ala Gly Phe Glu Ala * 545 550 aataccagtt cagttttgtc attgttctag taaattagtt ccatttgtaa aagggttact 1800 ctcatactcc ttatgtacag aaatcacatg aaaaataaag gttccataat gcaaaaaaaa 1860 aaaaaaaaaa aaaaaaaaaa aaa 1883 101 553 PRT Homo sapiens 101 Met Leu Ser Val Arg Val Ala Ala Ala Val Val Arg Ala Leu Pro 
Arg 1 5 10 15 Arg Ala Gly Leu Val Ser Arg Asn Ala Leu Gly Ser Ser Phe Ile Ala 20 25 30 Ala Arg Asn Phe His Ala Ser Asn Thr His Leu Gln Lys Thr Gly Thr 35 40 45 Ala Glu Met Ser Ser Ile Leu Glu Glu Arg Ile Leu Gly Ala Asp Thr 50 55 60 Ser Val Asp Leu Glu Glu Thr Gly Arg Val Leu Ser Ile Gly Asp Gly 65 70 75 80 Ile Ala Arg Val His Gly Leu Arg Asn Val Gln Ala Glu Glu Met Val 85 90 95 Glu Phe Ser Ser Gly Leu Lys Gly Met Ser Leu Asn Leu Glu Pro Asp 100 105 110 Asn Val Gly Val Val Val Phe Gly Asn Asp Lys Leu Ile Lys Glu Gly 115 120 125 Asp Ile Val Lys Arg Thr Gly Ala Ile Val Asp Val Pro Val Gly Glu 130 135 140 Glu Leu Leu Gly Arg Val Val Asp Ala Leu Gly Asn Ala Ile Asp Gly 145 150 155 160 Lys Gly Pro Ile Gly Ser Lys Thr Arg Arg Arg Val Gly Leu Lys Ala 165 170 175 Pro Gly Ile Ile Pro Arg Ile Ser Val Arg Glu Pro Met Gln Thr Gly 180 185 190 Ile Lys Ala Val Asp Ser Leu Val Pro Ile Gly Arg Gly Gln Arg Glu 195 200 205 Leu Ile Ile Gly Asp Arg Gln Thr Gly Lys Thr Ser Ile Ala Ile Asp 210 215 220 Thr Ile Ile Asn Gln Lys Arg Phe Asn Asp Gly Ser Asp Glu Lys Lys 225 230 235 240 Lys Leu Tyr Cys Ile Tyr Val Ala Ile Gly Gln Lys Arg Ser Thr Val 245 250 255 Ala Gln Leu Val Lys Arg Leu Thr Asp Ala Asp Ala Met Lys Tyr Thr 260 265 270 Ile Val Val Ser Ala Thr Ala Ser Asp Ala Ala Pro Leu Gln Tyr Leu 275 280 285 Ala Pro Tyr Ser Gly Cys Ser Met Gly Glu Tyr Phe Arg Asp Asn Gly 290 295 300 Lys His Ala Leu Ile Ile Tyr Asp Asp Leu Ser Lys Gln Ala Val Ala 305 310 315 320 Tyr Arg Gln Met Ser Leu Leu Leu Arg Arg Pro Pro Gly Arg Glu Ala 325 330 335 Tyr Pro Gly Asp Val Phe Tyr Leu His Ser Arg Leu Leu Glu Arg Ala 340 345 350 Ala Lys Met Asn Asp Ala Phe Gly Gly Gly Ser Leu Thr Ala Leu Pro 355 360 365 Val Ile Glu Thr Gln Ala Gly Asp Val Ser Ala Tyr Ile Pro Thr Asn 370 375 380 Val Ile Ser Ile Thr Asp Gly Gln Ile Phe Leu Glu Thr Glu Leu Phe 385 390 395 400 Tyr Lys Gly Ile Arg Pro Ala Ile Asn Val Gly Leu Ser Val Ser Arg 405 410 415 Val Gly Ser Ala Ala Gln Thr Arg Ala Met Lys Gln Val Ala Gly Thr 420 425 430 Met 
Lys Leu Glu Leu Ala Gln Tyr Arg Glu Val Ala Ala Phe Ala Gln 435 440 445 Phe Gly Ser Asp Leu Asp Ala Ala Thr Gln Gln Leu Leu Ser Arg Gly 450 455 460 Val Arg Leu Thr Glu Leu Leu Lys Gln Gly Gln Tyr Ser Pro Met Ala 465 470 475 480 Ile Glu Glu Gln Val Ala Val Ile Tyr Ala Gly Val Arg Gly Tyr Leu 485 490 495 Asp Lys Leu Glu Pro Ser Lys Ile Thr Lys Phe Glu Asn Ala Phe Leu 500 505 510 Ser His Val Val Ser Gln His Gln Ala Leu Leu Gly Thr Ile Arg Ala 515 520 525 Asp Gly Lys Ile Ser Glu Gln Ser Asp Ala Lys Leu Lys Glu Ile Val 530 535 540 Thr Asn Phe Leu Ala Gly Phe Glu Ala 545 550 102 6119 DNA Homo sapiens CDS (17)...(1060) 102 gaatattgga aagaag atg acc tgt cag gag ttc att gca aat ctg caa ggg 52 Met Thr Cys Gln Glu Phe Ile Ala Asn Leu Gln Gly 1 5 10 gta aat gag ggt gtt gat ttc tcc aag gat ctg ctg aaa gct ctg tac 100 Val Asn Glu Gly Val Asp Phe Ser Lys Asp Leu Leu Lys Ala Leu Tyr 15 20 25 aac tca atc aag aat gag aag ctt gaa tgg gca gta gat gat gaa gag 148 Asn Ser Ile Lys Asn Glu Lys Leu Glu Trp Ala Val Asp Asp Glu Glu 30 35 40 aaa aaa aag tct ccc tca gaa agt act gag gag aaa gct aac gga aca 196 Lys Lys Lys Ser Pro Ser Glu Ser Thr Glu Glu Lys Ala Asn Gly Thr 45 50 55 60 cat cca aag acc atc agt cgt att gga agt act act aac cca ttt ttg 244 His Pro Lys Thr Ile Ser Arg Ile Gly Ser Thr Thr Asn Pro Phe Leu 65 70 75 gac att cct cat gat cca aat gct gct gtg tac aaa agt gga ttc ttg 292 Asp Ile Pro His Asp Pro Asn Ala Ala Val Tyr Lys Ser Gly Phe Leu 80 85 90 gct cgg aaa att cat gca gat atg gat gga aag aag act cca aga gga 340 Ala Arg Lys Ile His Ala Asp Met Asp Gly Lys Lys Thr Pro Arg Gly 95 100 105 aaa cga gga tgg aaa acc ttt tat gct gta ctg aag gga aca gtt ctt 388 Lys Arg Gly Trp Lys Thr Phe Tyr Ala Val Leu Lys Gly Thr Val Leu 110 115 120 tac ttg caa aag gat gaa tac aag cca gaa aag gcc ttg tct gaa gag 436 Tyr Leu Gln Lys Asp Glu Tyr Lys Pro Glu Lys Ala Leu Ser Glu Glu 125 130 135 140 gac ttg aaa aac gct gtg agt gtg cac cac gca ttg gca tcc aag gcc 484 Asp Leu Lys Asn Ala Val Ser Val His His 
Ala Leu Ala Ser Lys Ala 145 150 155 acg gac tat gag aag aaa cca aac gtg ttt aaa ctt aaa act gcc gac 532 Thr Asp Tyr Glu Lys Lys Pro Asn Val Phe Lys Leu Lys Thr Ala Asp 160 165 170 tgg agg gtc ttg ctt ttt caa act cag agc cca gag gaa atg caa ggg 580 Trp Arg Val Leu Leu Phe Gln Thr Gln Ser Pro Glu Glu Met Gln Gly 175 180 185 tgg ata aac aaa atc aat tgt gtg gca gct gta ttt tct gca cca cca 628 Trp Ile Asn Lys Ile Asn Cys Val Ala Ala Val Phe Ser Ala Pro Pro 190 195 200 ttt cca gca gca atc ggc tct cag aag aag ttt agc cgc cca ctt ctg 676 Phe Pro Ala Ala Ile Gly Ser Gln Lys Lys Phe Ser Arg Pro Leu Leu 205 210 215 220 cct gcc act aca aca aaa ctg tct cag gag gag caa ctg aag tca cat 724 Pro Ala Thr Thr Thr Lys Leu Ser Gln Glu Glu Gln Leu Lys Ser His 225 230 235 gaa agt aag ctg aag cag atc acc acc gag ctg gcc gag cac cgc tca 772 Glu Ser Lys Leu Lys Gln Ile Thr Thr Glu Leu Ala Glu His Arg Ser 240 245 250 tat ccc ccc gac aag aag gtc aaa gcc aag gac gtc gat gag tac aaa 820 Tyr Pro Pro Asp Lys Lys Val Lys Ala Lys Asp Val Asp Glu Tyr Lys 255 260 265 ctg aaa gac cac tat ctg gag ttt gag aaa acc cgc tat gaa atg tat 868 Leu Lys Asp His Tyr Leu Glu Phe Glu Lys Thr Arg Tyr Glu Met Tyr 270 275 280 gtc agc att ctc aag gaa gga ggc aaa gag cta ctg agt aac gat gaa 916 Val Ser Ile Leu Lys Glu Gly Gly Lys Glu Leu Leu Ser Asn Asp Glu 285 290 295 300 agc gag gct gca gga ctg aag aag tcg cac tcg agt cct tcg ctg aac 964 Ser Glu Ala Ala Gly Leu Lys Lys Ser His Ser Ser Pro Ser Leu Asn 305 310 315 ccg gat act tct cca atc act gcc aaa gtc aag cgt aac gtg tca gag 1012 Pro Asp Thr Ser Pro Ile Thr Ala Lys Val Lys Arg Asn Val Ser Glu 320 325 330 agg aag gat cac cga cct gaa aca cca agc att aag caa aaa gtt act 1060 Arg Lys Asp His Arg Pro Glu Thr Pro Ser Ile Lys Gln Lys Val Thr 335 340 345 tagagtccat ctgcggccag gaagtgctgg tcatggagca aaatagggtt tttcaagatc 1120 tttctggtaa tccgtgaata tatttaaaaa aaaaaagtct gtgacaaaac ggtgcattag 1180 taattttttc tattgtatat ttttgttagt ttctgtacag attgtctttg ctcttgattt 1240 cttttgcttt 
gatgattttt gcaacttgat agctaatgca ccttttctgt gaggaggagg 1300 ggatcgtgat ttcagaatga attatgtatc ccttctcttt tggttttctc ttgtttgcag 1360 tctgctcagt tgttttatgt attctcatat caactgttaa actttttttt aaggttaaag 1420 aatttaatcc attgtgaaac acttaactgg acaaactgta gttttagtaa attctagctg 1480 gagttaatat acgcctttat atgtgaaatc ttgcccagtc acagaggtag aattgagcac 1540 tcacagatgc tccagtaaga atcacagtgc tgggaatcta gttgctccaa tatgaggcag 1600 cttcatgtgc agcttagcac ttgttgttga gatcggaccc tgctggaagc agggaaaaga 1660 agcgtgaaga tcgtaggatt gagaacttag ggaagcacat tagcttgctt gaagtgctga 1720 ttccatttca gccaagcaag ggaaagagga agtggagtca ttttgccttt gaaggctgag 1780 gaaagattga tacccagtta attttgtttg ctaaaggatg ggggcaataa tcggcccttg 1840 aggagctgca gcagtaggca tgtgctcagt ctgcaggaat tgttacctca ctcccacagg 1900 gtctagacta gaaatccatc atctctatcg ttgatatcct tccatccagg aatagatttt 1960 tcttactcta catatgtgtg tgtgcgtgcg tgtgtgtgcg tgtgtgggca tggggttgtg 2020 tcctggttgt gatattgagg tcttccttcc taacaaatta atactaaaat gaaacagctt 2080 ttcttgtgtc cttaagacaa aataaggaag gaaaacgtag ctgcagttgt ccacgatgga 2140 tattggttct ttaaaatata tctgaaagta gtagtcagaa tgaattatgg ttggaaaact 2200 gaggaatctt ctggttgcag gtgcaaagtg actttgttta ttcttgtctc agtctccttg 2260 atagccactt cactctgcta ctactcaact ttctcctaaa aatacttcat ctattttcag 2320 tcctttcttt ctgtctactc aaaatggttc tattaacttt gcagtcatga gcttgttcca 2380 gttacagtcc ctttgaagtt cagggtgata aacagaatat tcttctgtag aggaagagaa 2440 aggagtgaaa gtttagccca ctgagaccta gagctttgtg atttcctaac cttgaaactc 2500 tgtaatccct aaagttaaaa tctccgcaag tggcacaact tcagaactaa tagtatcact 2560 ttgatttttc tttttcctcc cttagaaagt ttctctagtt ctatagttta tttgttgaag 2620 gtactatgac caaagaatca gctgctctac aggaatagca tggttccagt gaattagaga 2680 aaacctgctg taaagccatg gtagtgtcta agtggtatgt tattatgatg tactagcatt 2740 tatttacaga attatttatt aacgtttact tccttcccct ctgtaaatgt ccatgactat 2800 tgcccagaga aggcttaccc ctctctaggg ttgcagttgc tttctttgta ataagtattt 2860 tgccacacct gtaaaaaaaa aaacctcact tttaactctc tgccttgttt gggtaaaggc 2920 agtaactaag tttatgtttc 
agaactgcaa aacaaacagg atagttacca atatggccca 2980 tgtatcagat tgatttttgt agcctctcac tgaatccaac atatccacaa gcaagttatc 3040 tgtctttcta cctgataatc taaattatca ggatatttgt tttctgccta aatgtttata 3100 ctaagccgag gggagagagg tacctagacc atgtcatcta caagcttcag taactaaaga 3160 aaaaggaact tccctgagtg gcttgaatgt gtttgcccac agtctatatc tatgtatata 3220 gaatgtctgt atgtatttta cttatttaat atacattgaa tggtaccttg ctacagtatt 3280 tctgacattt agagtagtgt tgaaatactc ggctagcatc agcaccacta tagcactgtc 3340 cgtgtcatat gagtcactaa tattaactcc agggacttct ggataggcta atagatcatt 3400 ggatacgaag ggctcttttg aagcttcagt ataccatgtt tgcatagttt atctttaaaa 3460 acaactttaa aggttctttt gtgagccagg atctcagact gccgtagcat gatgctgtcc 3520 atctttagcg catgggctga gaacacctct tccctgaggc ttctgaaggt tgctgtctgt 3580 catgagtgca tgaaggaggc caagagttta tgctatggga ggaaacagtc actgatttgc 3640 ctagattctg agagtctggc ccatagccaa ccacattttc ctttgggata atttatttcc 3700 tgtggcatct agccagaaga aattgaggat gtttcctttc acagctgctc caagcctgtt 3760 gcccaattca cggtacaagg gagcacccct tccctttcct ctgaaggtac gccacccacc 3820 tccgtcgccc acctcagcgc ccaggagcct tgggacttcc ttccatatga taaatcattc 3880 ttcttcacgt caatacactt catattaatt tctagtacag aaaatcttga cagctatcag 3940 aatgccttgg tcatagtgtt gttgcaaaat tgaccataca ggtggcccat gtataaaatc 4000 tgaattttag gggtttgtcc ccacctcgca tgctggcttt tacagggagg tgtctgggat 4060 tcctcattag caatcaaaac ttaattactg ggatgcagag tccttacttt atcgccagcc 4120 cgtaggcatt tctgaagtgc acttttttga aacatcattt tgctaactct cagcagtgtc 4180 taattaaact gagcaatact tttgtgaatt ttaattaatc tcagcaaaac catgatggga 4240 gagagtcctc tgatggaaat gtagtccctg gattatgtgt aaccttttta ttcctcttag 4300 atgcagagga tagaaagcat tttttggtgc agtggtcttg tggcaaacac aagaccctct 4360 atgcgtctcc aactgttatc ctaatctaga aaatgaggac tggcccctgg gcaaaagtga 4420 catgaggaat ttactctgga agaggaaaat ctgggtggct ttccaaggct aagataggtt 4480 tgtatttcac cctgtggcca agctacagaa cttctgagat tgtggaagaa tttttgcaac 4540 cagcagggaa agaggcctct tactgcctaa acacaaagtt acactgagct tttctactgt 4600 cctttgccta ttgctccctc tatcatgtaa 
agatctggga aggatgagag gcagggcctg 4660 cttgtcatga gctgcactct tttcttttta actaatcatt gacaattgga agaaaattga 4720 cgttaaagaa gtttctccat tgtcttacta acaaaacctt ttgggtttca ttaattgtcc 4780 ttgaaattga gttcctttgg catttttcct tgcagtcatc agttaagcat gttgcatcct 4840 gaattcacag aagtttagct ttgcaggttt gaatctctgt aatttaactc ccgtggactt 4900 ggtcgagttt tcagcaggtt gggagccacc tctcttcatt tcagcagtga gtcatccctt 4960 gacttttcaa atgacagaat tttttccaat tgtaaaatta gcactgtaaa acaaagaacc 5020 aaagtggcat cctaagagtt gttaaacctg aagtctagtt tatgaggaat tgtccaagtt 5080 ggagtttaaa tagtatctgc ttttgtctca aagcatctaa gttattctga cagaaaatgg 5140 taagtcagct ttgcaggcag atgcgcctct gggcctccta ccttgctcca cagctttctg 5200 gccatcttgt ctcccaggcc atgccactgc tctgccacat gtcagcaaat ttctttccac 5260 cagtcttata gcatcttaca tgatcaaatc atcacagaat aaccccgtga tagattattg 5320 atagcaatag agaggggctt tgtcactgat ttttctctca gattcctttt ccatctctca 5380 tccataaagg aaggactgaa atccaaaggc attctccttt tgtacctaca gtatccagaa 5440 cccacgtggg cagccttctg cttatgacaa taattggccc attgcatgca gagagaatgt 5500 cttcatagag agaatgtcat taaatacttg aatctgcatg acagtttgac ttgaatgcaa 5560 cagcaggaaa attttgcaag ttacataatt gtatatacag taggttttct taagtctctt 5620 cggttcatcc tttgtaattt gtgtgtgtat ctgtagtatt gcaggctttt ggagactatt 5680 cttacaggca gtatgtcagt catcaaagaa aatgctgtca cctgccattg ttgtatttgt 5740 gggtatttat agttgtatgt atgtaaatgc atcagtgtgt agattgcata tcagtgtatg 5800 gtacatgtac atcaaaatta tttttgtcct taatcagtgt gatatgaaaa gcaagtacaa 5860 cctcatagga ctgattatat aatgaagttg ttgagagtat atatagtggt attgttttat 5920 taaacttaaa ctcccataat attttgattc acattgttaa taagacttta tgctagaaaa 5980 ttctttgagc tttgaatcac cagggcaaaa atgactatca actaaccttg tgaatctttt 6040 gcagtgtact gtgtgcaata ccaagggcat agctccctgt aatttgggaa atacagaaag 6100 aaaagaaaaa aaaaaaaaa 6119 103 348 PRT Homo sapiens 103 Met Thr Cys Gln Glu Phe Ile Ala Asn Leu Gln Gly Val Asn Glu Gly 1 5 10 15 Val Asp Phe Ser Lys Asp Leu Leu Lys Ala Leu Tyr Asn Ser Ile Lys 20 25 30 Asn Glu Lys Leu Glu Trp Ala Val Asp Asp Glu Glu Lys Lys 
Lys Ser 35 40 45 Pro Ser Glu Ser Thr Glu Glu Lys Ala Asn Gly Thr His Pro Lys Thr 50 55 60 Ile Ser Arg Ile Gly Ser Thr Thr Asn Pro Phe Leu Asp Ile Pro His 65 70 75 80 Asp Pro Asn Ala Ala Val Tyr Lys Ser Gly Phe Leu Ala Arg Lys Ile 85 90 95 His Ala Asp Met Asp Gly Lys Lys Thr Pro Arg Gly Lys Arg Gly Trp 100 105 110 Lys Thr Phe Tyr Ala Val Leu Lys Gly Thr Val Leu Tyr Leu Gln Lys 115 120 125 Asp Glu Tyr Lys Pro Glu Lys Ala Leu Ser Glu Glu Asp Leu Lys Asn 130 135 140 Ala Val Ser Val His His Ala Leu Ala Ser Lys Ala Thr Asp Tyr Glu 145 150 155 160 Lys Lys Pro Asn Val Phe Lys Leu Lys Thr Ala Asp Trp Arg Val Leu 165 170 175 Leu Phe Gln Thr Gln Ser Pro Glu Glu Met Gln Gly Trp Ile Asn Lys 180 185 190 Ile Asn Cys Val Ala Ala Val Phe Ser Ala Pro Pro Phe Pro Ala Ala 195 200 205 Ile Gly Ser Gln Lys Lys Phe Ser Arg Pro Leu Leu Pro Ala Thr Thr 210 215 220 Thr Lys Leu Ser Gln Glu Glu Gln Leu Lys Ser His Glu Ser Lys Leu 225 230 235 240 Lys Gln Ile Thr Thr Glu Leu Ala Glu His Arg Ser Tyr Pro Pro Asp 245 250 255 Lys Lys Val Lys Ala Lys Asp Val Asp Glu Tyr Lys Leu Lys Asp His 260 265 270 Tyr Leu Glu Phe Glu Lys Thr Arg Tyr Glu Met Tyr Val Ser Ile Leu 275 280 285 Lys Glu Gly Gly Lys Glu Leu Leu Ser Asn Asp Glu Ser Glu Ala Ala 290 295 300 Gly Leu Lys Lys Ser His Ser Ser Pro Ser Leu Asn Pro Asp Thr Ser 305 310 315 320 Pro Ile Thr Ala Lys Val Lys Arg Asn Val Ser Glu Arg Lys Asp His 325 330 335 Arg Pro Glu Thr Pro Ser Ile Lys Gln Lys Val Thr 340 345 104 2029 DNA Homo sapiens CDS (211)...(1926) 104 actcagtgtt cgcgggagcc gcacctacac cagccaaccc agatcccgag gtccgacagc 60 gcccggccca gatccccacg cctgccagga gcaagccgag agccagccgg ccggcgcact 120 ccgactccga gcagtctctg tccttcgacc cgagccccgc gccctttccg ggacccctgc 180 cccgcgggca gcgctgccaa cctgccggcc atg gag acc ccg tcc cag cgg cgc 234 Met Glu Thr Pro Ser Gln Arg Arg 1 5 gcc acc cgc agc ggg gcg cag gcc agc tcc act ccg ctg tcg ccc acc 282 Ala Thr Arg Ser Gly Ala Gln Ala Ser Ser Thr Pro Leu Ser Pro Thr 10 15 20 cgc atc acc cgg ctg cag gag aag gag gac ctg cag 
gag ctc aat gat 330 Arg Ile Thr Arg Leu Gln Glu Lys Glu Asp Leu Gln Glu Leu Asn Asp 25 30 35 40 cgc ttg gcg gtc tac atc gac cgt gtg cgc tcg ctg gaa acg gag aac 378 Arg Leu Ala Val Tyr Ile Asp Arg Val Arg Ser Leu Glu Thr Glu Asn 45 50 55 gca ggg ctg cgc ctt cgc atc acc gag tct gaa gag gtg gtc agc cgc 426 Ala Gly Leu Arg Leu Arg Ile Thr Glu Ser Glu Glu Val Val Ser Arg 60 65 70 gag gtg tcc ggc atc aag gcc gcc tac gag gcc gag ctc ggg gat gcc 474 Glu Val Ser Gly Ile Lys Ala Ala Tyr Glu Ala Glu Leu Gly Asp Ala 75 80 85 cgc aag acc ctt gac tca gta gcc aag gag cgc gcc cgc ctg cag ctg 522 Arg Lys Thr Leu Asp Ser Val Ala Lys Glu Arg Ala Arg Leu Gln Leu 90 95 100 gag ctg agc aaa gtg cgt gag gag ttt aag gag ctg aaa gcg cgc aat 570 Glu Leu Ser Lys Val Arg Glu Glu Phe Lys Glu Leu Lys Ala Arg Asn 105 110 115 120 acc aag aag gag ggt gac ctg ata gct gct cag gct cgg ctg aag gac 618 Thr Lys Lys Glu Gly Asp Leu Ile Ala Ala Gln Ala Arg Leu Lys Asp 125 130 135 ctg gag gct ctg ctg aac tcc aag gag gcc gca ctg agc act gct ctc 666 Leu Glu Ala Leu Leu Asn Ser Lys Glu Ala Ala Leu Ser Thr Ala Leu 140 145 150 agt gag aag cgc acg ctg gag ggc gag ctg cat gat ctg cgg ggc cag 714 Ser Glu Lys Arg Thr Leu Glu Gly Glu Leu His Asp Leu Arg Gly Gln 155 160 165 gtg gcc aag ctt gag gca gcc cta ggt gag gcc aag aag caa ctt cag 762 Val Ala Lys Leu Glu Ala Ala Leu Gly Glu Ala Lys Lys Gln Leu Gln 170 175 180 gat gag atg ctg cgg cgg gtg gat gct gag aac agg ctg cag acc atg 810 Asp Glu Met Leu Arg Arg Val Asp Ala Glu Asn Arg Leu Gln Thr Met 185 190 195 200 aag gag gaa ctg gac ttc cag aag aac atc tac agt gag gag ctg cgt 858 Lys Glu Glu Leu Asp Phe Gln Lys Asn Ile Tyr Ser Glu Glu Leu Arg 205 210 215 gag acc aag cgc cgt cat gag acc cga ctg gtg gag att gac aat ggg 906 Glu Thr Lys Arg Arg His Glu Thr Arg Leu Val Glu Ile Asp Asn Gly 220 225 230 aag cag cgt gag ttt gag agc cgg ctg gcg gat gcg ctg cag gaa ctg 954 Lys Gln Arg Glu Phe Glu Ser Arg Leu Ala Asp Ala Leu Gln Glu Leu 235 240 245 cgg gcc cag cat gag gac cag gtg 
gag cag tat aag aag gag ctg gag 1002 Arg Ala Gln His Glu Asp Gln Val Glu Gln Tyr Lys Lys Glu Leu Glu 250 255 260 aag act tat tct gcc aag ctg gac aat gcc agg cag tct gct gag agg 1050 Lys Thr Tyr Ser Ala Lys Leu Asp Asn Ala Arg Gln Ser Ala Glu Arg 265 270 275 280 aac agc aac ctg gtg ggg gct gcc cac gag gag ctg cag cag tcg cgc 1098 Asn Ser Asn Leu Val Gly Ala Ala His Glu Glu Leu Gln Gln Ser Arg 285 290 295 atc cgc atc gac agc ctc tct gcc cag ctc agc cag ctc cag aag cag 1146 Ile Arg Ile Asp Ser Leu Ser Ala Gln Leu Ser Gln Leu Gln Lys Gln 300 305 310 ctg gca gcc aag gag gcg aag ctt cga gac ctg gag gac tca ctg gcc 1194 Leu Ala Ala Lys Glu Ala Lys Leu Arg Asp Leu Glu Asp Ser Leu Ala 315 320 325 cgt gag cgg gac acc agc cgg cgg ctg ctg gcg gaa aag gag cgg gag 1242 Arg Glu Arg Asp Thr Ser Arg Arg Leu Leu Ala Glu Lys Glu Arg Glu 330 335 340 atg gcc gag atg cgg gca agg atg cag cag cag ctg gac gag tac cag 1290 Met Ala Glu Met Arg Ala Arg Met Gln Gln Gln Leu Asp Glu Tyr Gln 345 350 355 360 gag ctt ctg gac atc aag ctg gcc ctg gac atg gag atc cac gcc tac 1338 Glu Leu Leu Asp Ile Lys Leu Ala Leu Asp Met Glu Ile His Ala Tyr 365 370 375 cgc aag ctc ttg gag ggc gag gag gag agg cta cgc ctg tcc ccc agc 1386 Arg Lys Leu Leu Glu Gly Glu Glu Glu Arg Leu Arg Leu Ser Pro Ser 380 385 390 cct acc tcg cag cgc agc cgt ggc cgt gct tcc tct cac tca tcc cag 1434 Pro Thr Ser Gln Arg Ser Arg Gly Arg Ala Ser Ser His Ser Ser Gln 395 400 405 aca cag ggt ggg ggc agc gtc acc aaa aag cgc aaa ctg gag tcc act 1482 Thr Gln Gly Gly Gly Ser Val Thr Lys Lys Arg Lys Leu Glu Ser Thr 410 415 420 gag agc cgc agc agc ttc tca cag cac gca cgc act agc ggg cgc gtg 1530 Glu Ser Arg Ser Ser Phe Ser Gln His Ala Arg Thr Ser Gly Arg Val 425 430 435 440 gcc gtg gag gag gtg gat gag gag ggc aag ttt gtc cgg ctg cgc aac 1578 Ala Val Glu Glu Val Asp Glu Glu Gly Lys Phe Val Arg Leu Arg Asn 445 450 455 aag tcc aat gag gac cag tcc atg ggc aat tgg cag atc aag cgc cag 1626 Lys Ser Asn Glu Asp Gln Ser Met Gly Asn Trp Gln Ile Lys Arg Gln 
460 465 470 aat gga gat gat ccc ttg ctg act tac cgg ttc cca cca aag ttc acc 1674 Asn Gly Asp Asp Pro Leu Leu Thr Tyr Arg Phe Pro Pro Lys Phe Thr 475 480 485 ctg aag gct ggg cag gtg gtg acg atc tgg gct gca gga gct ggg gcc 1722 Leu Lys Ala Gly Gln Val Val Thr Ile Trp Ala Ala Gly Ala Gly Ala 490 495 500 acc cac agc ccc cct acc gac ctg gtg tgg aag gca cag aac acc tgg 1770 Thr His Ser Pro Pro Thr Asp Leu Val Trp Lys Ala Gln Asn Thr Trp 505 510 515 520 ggc tgc ggg aac agc ctg cgt acg gct ctc atc aac tcc act ggg gaa 1818 Gly Cys Gly Asn Ser Leu Arg Thr Ala Leu Ile Asn Ser Thr Gly Glu 525 530 535 gaa gtg gcc atg cgc aag ctg gtg cgc tca gtg act gtg gtt gag gac 1866 Glu Val Ala Met Arg Lys Leu Val Arg Ser Val Thr Val Val Glu Asp 540 545 550 gac gag gat gag gat gga gat gac ctg ctc cat cac cac cat gtg agt 1914 Asp Glu Asp Glu Asp Gly Asp Asp Leu Leu His His His His Val Ser 555 560 565 ggt agc cgc cgc tgaggccgag cctgcactgg ggccacccag ccaggcctgg 1966 Gly Ser Arg Arg 570 gggcagcctc tccccagcct ccccgtgcca aaaatctttt cattaaagaa tgtttggaac 2026 ttt 2029 105 572 PRT Drosophila melanogaster 105 Met Glu Thr Pro Ser Gln Arg Arg Ala Thr Arg Ser Gly Ala Gln Ala 1 5 10 15 Ser Ser Thr Pro Leu Ser Pro Thr Arg Ile Thr Arg Leu Gln Glu Lys 20 25 30 Glu Asp Leu Gln Glu Leu Asn Asp Arg Leu Ala Val Tyr Ile Asp Arg 35 40 45 Val Arg Ser Leu Glu Thr Glu Asn Ala Gly Leu Arg Leu Arg Ile Thr 50 55 60 Glu Ser Glu Glu Val Val Ser Arg Glu Val Ser Gly Ile Lys Ala Ala 65 70 75 80 Tyr Glu Ala Glu Leu Gly Asp Ala Arg Lys Thr Leu Asp Ser Val Ala 85 90 95 Lys Glu Arg Ala Arg Leu Gln Leu Glu Leu Ser Lys Val Arg Glu Glu 100 105 110 Phe Lys Glu Leu Lys Ala Arg Asn Thr Lys Lys Glu Gly Asp Leu Ile 115 120 125 Ala Ala Gln Ala Arg Leu Lys Asp Leu Glu Ala Leu Leu Asn Ser Lys 130 135 140 Glu Ala Ala Leu Ser Thr Ala Leu Ser Glu Lys Arg Thr Leu Glu Gly 145 150 155 160 Glu Leu His Asp Leu Arg Gly Gln Val Ala Lys Leu Glu Ala Ala Leu 165 170 175 Gly Glu Ala Lys Lys Gln Leu Gln Asp Glu Met Leu Arg Arg Val Asp 180 185 190 Ala 
Glu Asn Arg Leu Gln Thr Met Lys Glu Glu Leu Asp Phe Gln Lys 195 200 205 Asn Ile Tyr Ser Glu Glu Leu Arg Glu Thr Lys Arg Arg His Glu Thr 210 215 220 Arg Leu Val Glu Ile Asp Asn Gly Lys Gln Arg Glu Phe Glu Ser Arg 225 230 235 240 Leu Ala Asp Ala Leu Gln Glu Leu Arg Ala Gln His Glu Asp Gln Val 245 250 255 Glu Gln Tyr Lys Lys Glu Leu Glu Lys Thr Tyr Ser Ala Lys Leu Asp 260 265 270 Asn Ala Arg Gln Ser Ala Glu Arg Asn Ser Asn Leu Val Gly Ala Ala 275 280 285 His Glu Glu Leu Gln Gln Ser Arg Ile Arg Ile Asp Ser Leu Ser Ala 290 295 300 Gln Leu Ser Gln Leu Gln Lys Gln Leu Ala Ala Lys Glu Ala Lys Leu 305 310 315 320 Arg Asp Leu Glu Asp Ser Leu Ala Arg Glu Arg Asp Thr Ser Arg Arg 325 330 335 Leu Leu Ala Glu Lys Glu Arg Glu Met Ala Glu Met Arg Ala Arg Met 340 345 350 Gln Gln Gln Leu Asp Glu Tyr Gln Glu Leu Leu Asp Ile Lys Leu Ala 355 360 365 Leu Asp Met Glu Ile His Ala Tyr Arg Lys Leu Leu Glu Gly Glu Glu 370 375 380 Glu Arg Leu Arg Leu Ser Pro Ser Pro Thr Ser Gln Arg Ser Arg Gly 385 390 395 400 Arg Ala Ser Ser His Ser Ser Gln Thr Gln Gly Gly Gly Ser Val Thr 405 410 415 Lys Lys Arg Lys Leu Glu Ser Thr Glu Ser Arg Ser Ser Phe Ser Gln 420 425 430 His Ala Arg Thr Ser Gly Arg Val Ala Val Glu Glu Val Asp Glu Glu 435 440 445 Gly Lys Phe Val Arg Leu Arg Asn Lys Ser Asn Glu Asp Gln Ser Met 450 455 460 Gly Asn Trp Gln Ile Lys Arg Gln Asn Gly Asp Asp Pro Leu Leu Thr 465 470 475 480 Tyr Arg Phe Pro Pro Lys Phe Thr Leu Lys Ala Gly Gln Val Val Thr 485 490 495 Ile Trp Ala Ala Gly Ala Gly Ala Thr His Ser Pro Pro Thr Asp Leu 500 505 510 Val Trp Lys Ala Gln Asn Thr Trp Gly Cys Gly Asn Ser Leu Arg Thr 515 520 525 Ala Leu Ile Asn Ser Thr Gly Glu Glu Val Ala Met Arg Lys Leu Val 530 535 540 Arg Ser Val Thr Val Val Glu Asp Asp Glu Asp Glu Asp Gly Asp Asp 545 550 555 560 Leu Leu His His His His Val Ser Gly Ser Arg Arg 565 570 
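As a consistency check on SEQ ID NO: 104 and SEQ ID NO: 105 above, the annotated coding region (211)...(1926) spans 1716 nucleotides, i.e. 572 codons, which matches the 572-residue length listed for SEQ ID NO: 105. The following minimal Python sketch illustrates that arithmetic and translates the first eight codons of the coding region as printed in the listing. The helper names (`cds_codon_count`, `translate`) and the partial codon table are illustrative assumptions and are not part of the original disclosure; the table covers only the codons needed here.

```python
# Partial standard genetic code: only the codons used in this example.
CODON_TABLE = {
    "atg": "Met", "gag": "Glu", "acc": "Thr", "ccg": "Pro",
    "tcc": "Ser", "cag": "Gln", "cgg": "Arg", "cgc": "Arg",
}


def cds_codon_count(start, end):
    """Number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(codons):
    """Translate a list of lowercase codons into three-letter residues."""
    return [CODON_TABLE[c] for c in codons]


# First eight codons of the CDS of SEQ ID NO: 104, as printed in the listing.
first_codons = "atg gag acc ccg tcc cag cgg cgc".split()

print(cds_codon_count(211, 1926))          # 572
print(" ".join(translate(first_codons)))   # Met Glu Thr Pro Ser Gln Arg Arg
```

The translated residues agree with the first eight residues printed for SEQ ID NO: 105.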

We claim:
 1. A method of identifying a compound that modulates ADHD in a mammal, comprising: (a) administering a test compound to an invertebrate; and (b) measuring a foraging behavior of said invertebrate, wherein a compound that modulates the foraging behavior of said invertebrate is characterized as a compound that modulates ADHD in a mammal.
 2. The method of claim 1, wherein said compound reduces ADHD in a mammal.
 3. The method of claim 1, wherein said foraging behavior comprises a phenotype of the for gene.
 4. The method of claim 1, wherein said compound increases the foraging behavior of a sitter.
 5. The method of claim 4, wherein said compound reduces ADHD in a mammal.
 6. The method of claim 1, wherein said compound decreases the foraging behavior of a Rover.
 7. The method of claim 6, wherein said compound reduces ADHD in a mammal.
 8. The method of claim 1, wherein said compound modulates the expression or activity of one or more mammalian polypeptides.
 9. The method of claim 1, wherein said invertebrate is an insect.
 10. The method of claim 1, wherein said invertebrate is Drosophila melanogaster.
 11. The method of claim 10, wherein said Drosophila melanogaster is an adult.
 12. The method of claim 10, wherein said Drosophila melanogaster is a larva.
 13. The method of claim 1, wherein said mammal is human.
 14. A method of identifying a compound that modulates hypertension in a mammal, comprising: (a) administering a test compound to an invertebrate; and (b) measuring a foraging behavior of said invertebrate, wherein a compound that modulates the foraging behavior of said invertebrate is characterized as a compound that modulates hypertension in a mammal.
 15. The method of claim 14, wherein said compound reduces hypertension in a mammal.
 16. The method of claim 14, wherein said compound increases the foraging behavior of a sitter.
 17. The method of claim 16, wherein said compound reduces hypertension in a mammal.
 18. The method of claim 14, wherein said foraging behavior comprises a phenotype of the for gene.
 19. The method of claim 14, wherein said compound decreases the foraging behavior of a Rover.
 20. The method of claim 19, wherein said compound reduces hypertension in a mammal.
 21. The method of claim 14, wherein said compound modulates the expression or activity of one or more mammalian polypeptides.
 22. The method of claim 14, wherein said invertebrate is an insect.
 23. The method of claim 14, wherein said invertebrate is Drosophila melanogaster.
 24. The method of claim 23, wherein said Drosophila melanogaster is an adult.
 25. The method of claim 23, wherein said Drosophila melanogaster is a larva.
 26. The method of claim 14, wherein said mammal is human.
 27. A method of identifying a compound that modulates ADHD in a mammal, comprising: (a) administering a test compound to an invertebrate; (b) measuring an expression level for one or more polynucleotides in said invertebrate; and (c) comparing said expression level for one or more polynucleotides in said invertebrate to an expression level of one or more polynucleotides in a reference invertebrate, wherein a compound having the effect of modulating said expression level of one or more polynucleotides associated with invertebrate foraging behavior in said test invertebrate relative to said reference invertebrate is identified as a compound that modulates ADHD in a mammal.
 28. The method of claim 27, wherein said compound reduces ADHD in a mammal.
 29. The method of claim 27, wherein said foraging behavior comprises a phenotype of the for gene.
 30. The method of claim 27, wherein said compound increases the foraging behavior of a sitter.
 31. The method of claim 30, wherein said compound decreases severity of ADHD in a mammal.
 32. The method of claim 27, wherein said compound decreases the foraging behavior of a Rover.
 33. The method of claim 32, wherein said compound decreases severity of ADHD in a mammal.
 34. The method of claim 27, wherein a compound that has the effect of increasing expression of one or more polynucleotides in said invertebrate relative to said reference invertebrate is identified as a compound that decreases severity of ADHD.
 35. The method of claim 27, wherein a compound that has the effect of decreasing expression of a specific polynucleotide in said invertebrate relative to said reference invertebrate is identified as a compound that decreases a symptom of ADHD.
 36. The method of claim 27, further comprising step (d) measuring an expression level for one or more polynucleotides in said reference invertebrate, wherein said test compound is administered to said reference invertebrate.
 37. The method of claim 27, wherein said invertebrate exhibits substantially the same foraging behavior as said reference invertebrate before said compound is administered.
 38. The method of claim 27, wherein said invertebrate exhibits a different foraging behavior from said reference invertebrate before said compound is administered.
 39. The method of claim 27, wherein said invertebrate exhibits substantially the same foraging behavior as said reference invertebrate after said compound is administered.
 40. The method of claim 27, wherein said invertebrate exhibits a different foraging behavior from said reference invertebrate after said compound is administered.
 41. The method of claim 27, wherein said invertebrate is an insect.
 42. The method of claim 27, wherein said invertebrate is Drosophila melanogaster.
 43. The method of claim 42, wherein said Drosophila melanogaster is an adult.
 44. The method of claim 42, wherein said Drosophila melanogaster is a larva.
 45. The method of claim 27, wherein said mammal is human.
 46. The method of claim 27, wherein said one or more polynucleotides having modulated expression levels are selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 47. The method of claim 46, wherein a mammalian polynucleotide comprises a nucleic acid sequence substantially the same as said one or more polynucleotides comprising said sequence selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 48. A method of identifying a compound that modulates hypertension in a mammal, comprising: (a) administering a test compound to an invertebrate; (b) measuring an expression level of one or more polynucleotides in said invertebrate; and (c) comparing said expression level of one or more polynucleotides in said invertebrate to an expression level of one or more polynucleotides in a reference invertebrate, wherein a compound having the effect of modulating said expression level of one or more polynucleotides associated with invertebrate foraging behavior in said test invertebrate relative to said reference invertebrate is identified as a compound that modulates hypertension in a mammal.
 49. The method of claim 48, wherein said compound reduces hypertension in a mammal.
 50. The method of claim 48, wherein said foraging behavior comprises a phenotype of the for gene.
 51. The method of claim 48, wherein said compound increases the foraging behavior of a sitter.
 52. The method of claim 48, wherein said compound decreases severity of hypertension in a mammal.
 53. The method of claim 48, wherein said compound decreases the foraging behavior of a Rover.
 54. The method of claim 53, wherein said compound decreases severity of hypertension in a mammal.
 55. The method of claim 48, wherein a compound that has the effect of increasing expression of one or more polynucleotides in said invertebrate relative to said reference invertebrate is identified as a compound that decreases severity of hypertension.
 56. The method of claim 48, wherein a compound that has the effect of decreasing expression of a specific polynucleotide in said test invertebrate relative to said reference invertebrate is characterized as a compound that decreases a symptom of hypertension.
 57. The method of claim 48, further comprising step (d) measuring an expression level for one or more polynucleotides in said reference invertebrate, wherein said test compound is administered to said reference invertebrate.
 58. The method of claim 48, wherein said invertebrate exhibits substantially the same foraging behavior as said reference invertebrate before said compound is administered.
 59. The method of claim 48, wherein said invertebrate exhibits a different foraging behavior from said reference invertebrate before said compound is administered.
 60. The method of claim 48, wherein said invertebrate exhibits substantially the same foraging behavior as said reference invertebrate after said compound is administered.
 61. The method of claim 48, wherein said invertebrate exhibits a different foraging behavior from said reference invertebrate after said compound is administered.
 62. The method of claim 48, wherein said invertebrate is an insect.
 63. The method of claim 48, wherein said invertebrate is Drosophila melanogaster.
 64. The method of claim 63, wherein said Drosophila melanogaster is an adult.
 65. The method of claim 63, wherein said Drosophila melanogaster is a larva.
 66. The method of claim 48, wherein said mammal is human.
 67. The method of claim 48, wherein said one or more polynucleotides having modulated expression levels are selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 68. The method of claim 67, wherein a mammalian polynucleotide comprises a nucleic acid sequence substantially the same as said one or more polynucleotides comprising said sequence selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 69. A method of identifying a polynucleotide that correlates with ADHD in a mammal, comprising: (a) obtaining a first and a second strain of an invertebrate; (b) subjecting said first and second invertebrate strains to conditions in which said first strain exhibits a foraging behavior different than a foraging behavior exhibited by said second strain; (c) measuring an expression level for one or more polynucleotides in said first and second strains, and (d) identifying one or more polynucleotides that are differentially expressed in said first strain relative to said second strain, wherein a mammalian polynucleotide comprising substantially the same nucleic acid sequence as said one or more differentially expressed polynucleotides correlates with ADHD in a mammal.
 70. The method of claim 69, wherein increased expression of said mammalian polynucleotide correlates with an increase or decrease in severity of ADHD.
 71. The method of claim 69, wherein decreased expression of said mammalian polynucleotide correlates with an increase or decrease in severity of ADHD.
 72. The method of claim 69, wherein said foraging behavior comprises a phenotype of the for gene.
 73. The method of claim 69, wherein said different behavior exhibited by said first strain is reduced foraging behavior in a Rover.
 74. The method of claim 69, wherein said different behavior exhibited by said first strain is increased foraging behavior in a sitter.
 75. The method of claim 69, wherein said mammalian polynucleotide comprises a nucleic acid sequence substantially the same as a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 76. The method of claim 69, wherein said mammalian polynucleotide encodes a polypeptide having characteristics selected from the group consisting of molecular weight of 50 kD and pI of 4.1, a molecular weight of 28 kD and pI of 8.7, a molecular weight of 36 kD and pI of 6.0, a molecular weight of 34 kD and pI of 6.3, a molecular weight of 25 kD and pI of 5.9, a molecular weight of 12 kD and pI of 5.7, a molecular weight of 12 kD and pI of 6.4, a molecular weight of 12 kD and pI of 6.4, or a molecular weight of 29 kD and pI of 6.5.
 77. The method of claim 69, wherein said one or more polynucleotides that are differentially expressed in said first strain relative to said second strain have increased expression and said mammalian polynucleotide has increased expression when involved in ADHD in a mammal.
 78. The method of claim 69, wherein said one or more polynucleotides that are differentially expressed in said first strain relative to said second strain have decreased expression and said mammalian polynucleotide has decreased expression when involved in ADHD in a mammal.
 79. A method of identifying a polynucleotide that correlates with hypertension comprising: (a) obtaining a first and a second strain of an invertebrate; (b) subjecting said first and second invertebrate strains to conditions in which said first strain exhibits a foraging behavior different than a foraging behavior exhibited by said second strain; (c) measuring an expression level for one or more polynucleotides in said first and second strains, and (d) identifying one or more polynucleotides that are differentially expressed in said first strain relative to said second strain, wherein a mammalian polynucleotide comprising substantially the same nucleic acid sequence as said one or more differentially expressed polynucleotides correlates with hypertension.
 80. The method of claim 79, wherein increased expression of said mammalian polynucleotide correlates with an increase or decrease in severity of hypertension.
 81. The method of claim 79, wherein decreased expression of said mammalian polynucleotide correlates with an increase or decrease in severity of hypertension.
 82. The method of claim 79, wherein said foraging behavior comprises a phenotype of the for gene.
 83. The method of claim 79, wherein said different behavior exhibited by said first strain is reduced foraging behavior in a Rover.
 84. The method of claim 79, wherein said different behavior exhibited by said first strain is increased foraging behavior in a sitter.
 85. The method of claim 79, wherein said mammalian polynucleotide comprises a nucleic acid sequence substantially the same as a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 86. The method of claim 79, wherein said mammalian polynucleotide encodes a polypeptide having the characteristics selected from the group consisting of molecular weight of 50 kD and pI of 4.1, a molecular weight of 28 kD and pI of 8.7, a molecular weight of 36 kD and pI of 6.0, a molecular weight of 34 kD and pI of 6.3, a molecular weight of 25 kD and pI of 5.9, a molecular weight of 12 kD and pI of 5.7, a molecular weight of 12 kD and pI of 6.4, a molecular weight of 12 kD and pI of 6.4, and a molecular weight of 29 kD and pI of 6.5.
 87. The method of claim 79, wherein said one or more polynucleotides that are differentially expressed in said first strain relative to said second strain have increased expression and said mammalian polynucleotide has increased expression when involved in hypertension in a mammal.
 88. The method of claim 79, wherein said one or more polynucleotides that are differentially expressed in said first strain relative to said second strain have decreased expression and said mammalian polynucleotide has decreased expression when involved in hypertension in a mammal.
 89. An isolated polynucleotide having ADHD-altering activity in a mammal, or fragment thereof, comprising substantially the same nucleic acid sequence as a polynucleotide selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 90. An isolated polynucleotide having hypertension-altering activity in a mammal, or fragment thereof, comprising substantially the same nucleic acid sequence as a polynucleotide selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 91. An isolated polypeptide having ADHD-altering activity in a mammal, or fragment thereof, comprising substantially the same amino acid sequence as an amino acid sequence encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 92. An antibody specific for the isolated polypeptide of claim 91.
 93. The antibody of claim 92, wherein said antibody is a monoclonal antibody.
 94. The antibody of claim 92, wherein said antibody is a polyclonal antibody.
 95. An isolated polypeptide having hypertension-altering activity in a mammal, or fragment thereof, comprising substantially the same amino acid sequence as an amino acid sequence encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1-75, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 100, SEQ ID NO: 102 and SEQ ID NO: 104.
 96. An isolated polypeptide having ADHD-altering activity in a mammal, or fragment thereof, comprising a polypeptide having the characteristics selected from the group consisting of molecular weight of 50 kD and pI of 4.1, a molecular weight of 28 kD and pI of 8.7, a molecular weight of 36 kD and pI of 6.0, a molecular weight of 34 kD and pI of 6.3, a molecular weight of 25 kD and pI of 5.9, a molecular weight of 12 kD and pI of 5.7, a molecular weight of 12 kD and pI of 6.4, a molecular weight of 12 kD and pI of 6.4, and a molecular weight of 29 kD and pI of 6.5.
 100. An isolated polypeptide having hypertension-altering activity in a mammal, or fragment thereof, comprising a polypeptide having the characteristics selected from the group consisting of molecular weight of 50 kD and pI of 4.1, a molecular weight of 28 kD and pI of 8.7, a molecular weight of 36 kD and pI of 6.0, a molecular weight of 34 kD and pI of 6.3, a molecular weight of 25 kD and pI of 5.9, a molecular weight of 12 kD and pI of 5.7, a molecular weight of 12 kD and pI of 6.4, a molecular weight of 12 kD and pI of 6.4, and a molecular weight of 29 kD and pI of 6.5. 
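
The expression comparison recited in claims 27, 48, 69 and 79 can be illustrated with a minimal Python sketch. It assumes that expression levels are available as positive numbers per polynucleotide for a test (or first-strain) invertebrate and a reference (or second-strain) invertebrate, and it flags a polynucleotide as differentially expressed when the absolute log2 ratio meets a cutoff. The function name `differentially_expressed`, the two-fold cutoff, and the example identifiers and values are hypothetical; the claims do not specify a measurement platform, a statistic, or a threshold.

```python
import math


def differentially_expressed(test, reference, min_abs_log2=1.0):
    """Return {id: log2(test/reference)} for polynucleotides whose
    expression differs by at least the given log2 cutoff.

    `test` and `reference` map polynucleotide identifiers to positive
    expression levels. Identifiers missing from either map are skipped.
    """
    hits = {}
    for seq_id in test.keys() & reference.keys():
        t, r = test[seq_id], reference[seq_id]
        if t <= 0 or r <= 0:
            continue  # a log ratio is undefined for non-positive levels
        log2_ratio = math.log2(t / r)
        if abs(log2_ratio) >= min_abs_log2:
            hits[seq_id] = log2_ratio
    return hits


# Hypothetical expression levels (arbitrary units), for illustration only.
test_levels = {"seq_A": 8.0, "seq_B": 5.0, "seq_C": 1.0}
reference_levels = {"seq_A": 2.0, "seq_B": 4.0, "seq_C": 4.0}

result = differentially_expressed(test_levels, reference_levels)
for seq_id in sorted(result):
    direction = "increased" if result[seq_id] > 0 else "decreased"
    print(seq_id, direction, round(result[seq_id], 2))
```

With these illustrative values, `seq_A` is reported as increased and `seq_C` as decreased relative to the reference, while `seq_B` falls below the assumed cutoff. In practice, replicate measurements and a statistical test would likely be needed before calling a polynucleotide differentially expressed.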